Guide#

Welcome to LaminDB! 👋

Curate, store, track, query, integrate, and learn from biological data.

LaminDB is a distributed data management system in which users collaborate on DB instances.

Each LaminDB instance is a data lakehouse that manages indexed object storage (local directories, S3, GCP) with a SQL query engine (SQLite, Postgres, and soon, BigQuery).

This is analogous to how developers collaborate on code in repositories, but unlike git and dvc, LaminDB is queryable by entities.
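To make the lakehouse idea concrete, here is a minimal toy sketch that uses only the Python standard library, not LaminDB's API. It assumes a temporary local directory standing in for object storage and an in-memory SQLite database as the index; the table name `stored_file`, the `species` column, and both helper functions are hypothetical names chosen for illustration.

```python
import sqlite3
import tempfile
from pathlib import Path


def build_toy_instance(root: Path) -> sqlite3.Connection:
    """Write two small files under `root` and index them in SQLite.

    Hypothetical stand-in for an instance: `root` plays the role of
    object storage, the SQLite table plays the role of the query engine.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stored_file ("
        "id INTEGER PRIMARY KEY, path TEXT NOT NULL, species TEXT NOT NULL)"
    )
    files = {
        "mouse_counts.csv": ("gene,count\nActb,10\n", "mouse"),
        "human_counts.csv": ("gene,count\nACTB,12\n", "human"),
    }
    for name, (content, species) in files.items():
        path = root / name
        path.write_text(content)  # the "storage" side
        conn.execute(  # the "index" side
            "INSERT INTO stored_file (path, species) VALUES (?, ?)",
            (str(path), species),
        )
    conn.commit()
    return conn


def query_by_species(conn: sqlite3.Connection, species: str) -> list[str]:
    """Return the names of stored files annotated with the given species."""
    rows = conn.execute(
        "SELECT path FROM stored_file WHERE species = ? ORDER BY path",
        (species,),
    ).fetchall()
    return [Path(p).name for (p,) in rows]


with tempfile.TemporaryDirectory() as tmp:
    conn = build_toy_instance(Path(tmp))
    mouse_files = query_by_species(conn, "mouse")
    print(mouse_files)
    conn.close()
```

The point of the sketch is the contrast with git or dvc: files are looked up by an entity they are annotated with (here, a species) rather than by path or commit.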

Warning

Public beta: currently recommended only for collaborators, as we are still making breaking changes.

Features#

LaminDB comes with:

data lineage and edit history
tracking of interactive notebooks
knowledge-managed biological entities for typing and lookups
configurable schema modules

LaminDB is built on open-source Python packages.

Getting started#

Quick setup on the command line (see Initialize a LaminDB instance for an advanced setup guide):

Install via pip install lamindb
Sign up via lndb signup <email> and confirm the sign-up email
Log in via lndb login <handle>
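After the install step, a quick sanity check from Python can confirm that the package is visible to the current environment. The snippet below is a small sketch that relies only on the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and not part of LaminDB.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "lamindb") -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("lamindb is not installed in this environment")
else:
    print(f"lamindb {version} is installed")
```

If the check reports that the package is missing, the notebook kernel is probably running in a different environment from the one where `pip install lamindb` was executed.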

Tip

Each page in this guide is a Jupyter Notebook, which you can download here.
You can run these notebooks in hosted versions of JupyterLab, e.g., Saturn Cloud, Google Vertex AI, and others.
We recommend JupyterLab for the best notebook-tracking experience.

📬 Reach out to report issues or to learn about data modules that connect your assays, pipelines, and workflows as part of our data platform enterprise plan.