Guide
Welcome to LaminDB!
Curate, store, track, query, integrate, and learn from biological data.
LaminDB is a distributed data management system in which users collaborate on DB instances.
Each LaminDB instance is a data lakehouse that manages indexed object storage (local directories, S3, GCP) with a SQL query engine (SQLite, Postgres, and soon, BigQuery).
This is analogous to how developers collaborate on code in repositories, but unlike git and dvc, LaminDB is queryable by entities.
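To make "queryable by entities" concrete, here is a minimal conceptual sketch of the lakehouse idea: files live in a storage directory, and a SQL table indexes them by a biological entity. It uses only the Python standard library and is not LaminDB's API or schema; the table `data_object`, its columns, and the helper functions are hypothetical names chosen for illustration.

```python
import sqlite3
import tempfile
from pathlib import Path

# Conceptual sketch only: NOT LaminDB's API or schema.
# The table, column, and function names below are hypothetical.


def register_file(conn, storage_root, name, content, species):
    """Write an object to storage and index it in the SQL database."""
    path = Path(storage_root) / name
    path.write_bytes(content)
    conn.execute(
        "INSERT INTO data_object (name, path, species) VALUES (?, ?, ?)",
        (name, str(path), species),
    )
    conn.commit()
    return path


def query_by_species(conn, species):
    """Return the names of all indexed objects annotated with `species`."""
    rows = conn.execute(
        "SELECT name FROM data_object WHERE species = ? ORDER BY name",
        (species,),
    ).fetchall()
    return [row[0] for row in rows]


# A temporary local directory stands in for object storage,
# and an in-memory SQLite database stands in for the query engine.
storage_root = tempfile.mkdtemp()
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data_object (name TEXT PRIMARY KEY, path TEXT, species TEXT)"
)

register_file(conn, storage_root, "scrna_mouse.csv", b"gene,count\n", "mouse")
register_file(conn, storage_root, "scrna_human.csv", b"gene,count\n", "human")
register_file(conn, storage_root, "imaging_mouse.csv", b"x,y\n", "mouse")

# Query by entity rather than by file path or commit history.
mouse_files = query_by_species(conn, "mouse")
print(mouse_files)
```

The point of the sketch is the contrast with git or dvc: the lookup key is an entity (here, a species) recorded in the database, not a path or revision.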
Warning
Public beta: currently recommended only for collaborators, as we are still making breaking changes.
Features
LaminDB comes with
data lineage and edit history
tracking of interactive notebooks
knowledge-managed biological entities for typing and lookups
configurable schema modules
LaminDB is built on open-source Python packages.
Getting started
Quick setup on the command line (see Initialize a LaminDB instance for an advanced setup guide):
Install via
pip install lamindb
Sign up via
lndb signup <email>
and confirm the sign-up email.
Log in via
lndb login <handle>
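After the install step, it can be useful to confirm that the package is visible to the Python environment you plan to work in. The following is a small sketch using only the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical, and the check says nothing about whether sign-up or login succeeded.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("lamindb")
if version is None:
    print("lamindb is not installed here; run `pip install lamindb`.")
else:
    print(f"lamindb {version} is installed.")
```

Run it with the same interpreter (or notebook kernel) that you used for `pip install`, since each environment has its own set of installed packages.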
Tip
Each page in this guide is a Jupyter Notebook, which you can download here.
You can run these notebooks in hosted versions of JupyterLab, e.g., Saturn Cloud, Google Vertex AI, and others.
We recommend using JupyterLab for the best notebook tracking experience.
Reach out to report issues or to learn about data modules that connect your assays, pipelines, and workflows within our data platform enterprise plan.