Using Neptune with pandas#

Tip

This guide shows how to log pandas DataFrames to Neptune.

For the best display experience, however, we recommend uploading tabular data as CSV:

run["test/preds"].upload("path/to/test_preds.csv")

This method lets you browse the data in interactive table format in the Neptune app.

pandas is a popular open-source data analysis and manipulation tool. With Neptune, you can log and visualize pandas DataFrames.

Custom dashboard displaying metadata logged with pandas

Before you start#

  • Sign up at neptune.ai/register.
  • Create a project for storing your metadata.
  • Have pandas and Neptune installed, using either pip:

    pip install -U pandas neptune

    or conda:

    conda install pandas neptune

Upgrading with neptune-client already installed

Important: To smoothly upgrade to the 1.0 version of the Neptune client library, first uninstall the neptune-client library and then install neptune.

pip uninstall neptune-client
pip install neptune

pandas logging example#

  1. Import Neptune and start a run:

    import neptune
    
    run = neptune.init_run()

    If you haven't set up your credentials, you can log anonymously instead:

      run = neptune.init_run(
          api_token=neptune.ANONYMOUS_API_TOKEN,
          project="common/quickstarts",
      )

  2. Create a pandas DataFrame object:

    import pandas as pd
    
    iris_df = pd.read_csv(
        "https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv",
        nrows=100,
    )
    
  3. Log the DataFrame to Neptune:

    from neptune.types import File
    
    run["data/iris-df"].upload(File.as_html(iris_df))
    
  4. To stop the connection to Neptune and sync all data, call the stop() method:

    run.stop()
    
  5. To open the run, click the Neptune link in the console output.

    Example link: https://app.neptune.ai/o/common/org/showroom/e/SHOW-102/metadata

Result

The DataFrame is logged to the run as an HTML object.

You can view it in the All metadata section.

Converting the DataFrame to CSV#

You can save the DataFrame as a CSV and then upload it to Neptune with the upload() method. This lets you view and sort the data in Neptune's interactive table format.
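The save-then-upload approach can be sketched as follows. This is a minimal illustration, not an official recipe: the helper name `upload_df_as_csv`, the field name, and the file name are hypothetical, and `run` is assumed to be a Neptune run such as the one created earlier with `neptune.init_run()`.

```python
from pathlib import Path

import pandas as pd


def upload_df_as_csv(run, df: pd.DataFrame, field: str, csv_path) -> Path:
    """Write `df` to `csv_path` and upload that file to `run[field]`.

    `run` is assumed to be a Neptune run (for example, from neptune.init_run()).
    This helper is hypothetical; it is not part of the Neptune API.
    """
    csv_path = Path(csv_path)
    df.to_csv(csv_path, index=False)  # index=False leaves out the row-index column
    run[field].upload(str(csv_path))  # same upload() call as in the tip at the top
    return csv_path


# Example usage with the objects from the steps above:
# upload_df_as_csv(run, iris_df, "data/iris-csv", "iris.csv")
# run.stop()
```

It is safest to leave the CSV file in place until `run.stop()` has returned, so that the upload has a chance to finish.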

Alternatively, you can write the DataFrame to an in-memory CSV buffer and upload that buffer with the from_stream() method, which avoids creating a file on disk:

from io import StringIO
from neptune.types import File

csv_buffer = StringIO()
iris_df.to_csv(csv_buffer, index=False)  # iris_df is the DataFrame created in the example above
run["df_as_csv_from_buffer"].upload(File.from_stream(csv_buffer, extension="csv"))
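One detail of the buffer approach that can be surprising: after `to_csv()` writes into a `StringIO`, the buffer's position sits at the end of the data. The sketch below uses only pandas and the standard library to show this, with a small made-up DataFrame. Our understanding is that `from_stream()` rewinds the stream before reading (through a `seek` argument that defaults to 0), but treat that as an assumption and check the Neptune API reference.

```python
from io import StringIO

import pandas as pd

# Small illustrative DataFrame (made-up values, not the iris data).
df = pd.DataFrame({"species": ["setosa", "virginica"], "petal_length": [1.4, 5.1]})

csv_buffer = StringIO()
df.to_csv(csv_buffer, index=False)

# After writing, the position is at the end, so a plain read() returns nothing.
remaining = csv_buffer.read()

# getvalue() returns the full contents regardless of the current position.
csv_text = csv_buffer.getvalue()

# Rewinding makes the buffer readable from the start again.
csv_buffer.seek(0)
header = csv_buffer.readline().strip()
```

If you read the buffer yourself before uploading it, for example to inspect the CSV text, use `getvalue()` or call `seek(0)` first.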
