Great Expectations OSS integration guide#
Great Expectations OSS (GX OSS) is an open-source tool for validating, documenting, and monitoring your data. With Neptune, you can:
- Log GX OSS's configurations
- Log validation results and display them in the Neptune app
- Upload GX OSS's rich HTML reports and interact with them in the Neptune app
Before you start#
- Sign up at neptune.ai/register.
- Create a project for storing your metadata.
- Have Neptune and GX OSS installed, for example with pip install -U neptune great_expectations.
To see how the integration works without setting up your environment, run the example in Colab.
Quickstart#
To log metadata to Neptune, create a run object in your script. Run objects contain the metadata that you want to track, plus the automatically logged system metrics.
To log metadata to Neptune:
- Save your Neptune API token and full project name as environment variables.
How do I save my credentials as environment variables?
Set your Neptune API token and full project name to the NEPTUNE_API_TOKEN and NEPTUNE_PROJECT environment variables, respectively. On Windows, you can also navigate to Settings → Edit the system environment variables and add the variables there.
To find your credentials:
- API token: In the bottom-left corner of the Neptune app, expand your user menu and select Get your API token. If you need the token of a service account, go to the workspace or project settings and enter the Service accounts settings.
- Project name: Your full project name has the form workspace-name/project-name. You can copy it from the project menu, under Details & privacy.
If you're working in Google Colab, you can set your credentials with the os and getpass libraries.
- In your script, import the neptune package.
- Initialize a Neptune run with neptune.init_run(). You can specify additional run parameters, such as tags or a description. For a full list of options, see the API reference.
- Log the GX OSS metadata. For example, you can log a Data Context configuration under the gx/context/config namespace of a run. For details, see Logging examples.
- To stop the connection to Neptune and sync all data, call the run's stop() method.
- Run your script.
To open the run and watch the logging live, click the Neptune link that appears in the console output.
Example link: https://app.neptune.ai/o/showcase/org/great-expectations/e/GX-1
Use the Neptune app to visualize, compare, and organize your logged metadata. For details, see Experiments.
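The Quickstart steps can be sketched end to end as follows. This is a minimal sketch rather than the official example: `set_credentials`, `missing_credentials`, and `main` are hypothetical helper names, the `gx-quickstart` tag is illustrative, and the `context.get_config().to_json_dict()` call assumes a GX OSS 0.x release. The third-party imports sit inside `main()` so that the credential helpers can be used on their own.

```python
import os
from getpass import getpass

REQUIRED_VARIABLES = ("NEPTUNE_API_TOKEN", "NEPTUNE_PROJECT")


def set_credentials(project, ask=getpass):
    """Set both variables for the current process, for example in Colab."""
    # getpass hides the token while you paste it into the prompt.
    os.environ["NEPTUNE_API_TOKEN"] = ask("Enter your Neptune API token: ")
    os.environ["NEPTUNE_PROJECT"] = project  # form: workspace-name/project-name


def missing_credentials(env=None):
    """Return the names of the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]


def main():
    missing = missing_credentials()
    if missing:
        raise SystemExit(f"Set these environment variables first: {missing}")

    # Third-party imports are local so the helpers above work without them.
    import great_expectations as gx
    import neptune

    # With no token or project argument, init_run() reads both from the environment.
    run = neptune.init_run(tags=["gx-quickstart"])
    try:
        context = gx.get_context()
        # Assumption: get_config().to_json_dict() as in GX OSS 0.x releases.
        run["gx/context/config"] = context.get_config().to_json_dict()
    finally:
        # Sync the remaining data and close the connection.
        run.stop()
```

Call `main()` from your script once both packages are installed. In Colab, call `set_credentials("workspace-name/project-name")` first, replacing the argument with your own full project name.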
Logging examples#
You can organize the logged metadata into a folder-like structure with the namespaces and fields of a run object. For details, see Namespaces and fields.
Tip
To view the logging examples in the Neptune app, check the GX metadata dashboard.
Log a Data Context configuration#
You can log a Data Context configuration as a dictionary under the gx/context/config namespace of a run.
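A minimal sketch of logging the Data Context configuration follows. It assumes that `context` is a GX OSS Data Context whose `get_config()` returns an object with a `to_json_dict()` method, as in the 0.x releases, and that `run` is an initialized Neptune run. The helper name `log_context_config` is hypothetical.

```python
def log_context_config(run, context, namespace="gx/context/config"):
    """Log a Data Context configuration as a dictionary under `namespace`."""
    # Assumption: get_config() returns an object with to_json_dict(),
    # as in GX OSS 0.x releases. Check this against your installed version.
    config = context.get_config().to_json_dict()
    # Assigning a dictionary stores its keys as fields under the namespace.
    run[namespace] = config
    return config
```

With real objects, the call would look like `log_context_config(run, gx.get_context())`. Returning the dictionary makes it easy to inspect what was sent.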
Log a Checkpoint configuration#
The Checkpoint configuration contains values of types that Neptune doesn't support, such as lists. To convert all the lists to strings, use the stringify_unsupported() function.
To log a Checkpoint config under the gx/checkpoint/config namespace of a run, use:
from neptune.utils import stringify_unsupported
run["gx/checkpoint/config"] = stringify_unsupported(checkpoint.config.to_json_dict())
Log Expectations#
To log and organize your Expectations, use:
expectation_suite = validator.get_expectation_suite().to_json_dict()
# Log the suite-level metadata to the `gx/meta` namespace:
run["gx/meta"] = expectation_suite["meta"]
# Log the Expectation Suite name to the `gx/expectations/expectations_suite_name`
# field of a run:
run["gx/expectations/expectations_suite_name"] = expectation_suite[
    "expectation_suite_name"
]
# Create a numbered folder for each Expectation in the `gx/expectations` namespace:
for idx, expectation in enumerate(expectation_suite["expectations"]):
    run["gx/expectations"][idx] = expectation
Log validation results#
By saving validation results as a dictionary, you can access them programmatically and use them in your CI/CD pipelines:
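# Convert the first validation result of the Checkpoint run to a dictionary:
results_dict = checkpoint_result.list_validation_results()[0].to_json_dict()
run["gx/validations/json"] = results_dict
# Create a numbered folder for each result in the `gx/validations/json/results` namespace:
for idx, result in enumerate(results_dict["results"]):
    run["gx/validations/json/results"][idx] = result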
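One way to use the saved dictionary in a CI/CD pipeline is to fail the job when any Expectation fails. The sketch below assumes the result dictionary has the shape produced by GX OSS 0.x: a top-level `success` flag and a `results` list whose items carry `success` and `expectation_config`. The names `failed_expectations` and `gate` are hypothetical.

```python
import sys


def failed_expectations(results_dict):
    """Return the expectation types of all results that did not succeed."""
    failed = []
    for result in results_dict.get("results", []):
        if not result.get("success", False):
            config = result.get("expectation_config") or {}
            failed.append(config.get("expectation_type", "unknown"))
    return failed


def gate(results_dict):
    """Return a process exit code: 0 if validation passed, 1 otherwise."""
    failed = failed_expectations(results_dict)
    if results_dict.get("success", False) and not failed:
        return 0
    print(f"Validation failed for: {failed}", file=sys.stderr)
    return 1


# In a pipeline step, after building results_dict from the Checkpoint result:
# sys.exit(gate(results_dict))
```

Returning an exit code instead of raising keeps the check easy to call from a script, a test, or a pipeline step.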
Upload HTML reports#
You can also upload the rich HTML reports to Neptune and then interact with them in the app:
- HTML reports are available only for FileDataContext. Start by converting the context to FileDataContext.
- Fetch the local_site_path of the Data Context.
- Log the HTML Expectations and Validation reports from that path.
- To view the uploaded reports in the Neptune app, navigate to All metadata, or create a custom dashboard with a File preview widget.
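The upload steps above can be sketched as follows. Treat this as an assumption-laden sketch: it presumes that `context.build_data_docs()` returns a dictionary mapping the site name to a `file://` URL of the site's index page, that the site folder contains `expectations` and `validations` subfolders, and that your context offers `convert_to_file_context()`. The helper names and the `gx/expectations/html` and `gx/validations/html` field names are illustrative.

```python
import os
from urllib.parse import unquote, urlparse


def site_dir_from_url(index_url):
    """Turn a file:// URL of a Data Docs index page into its local directory."""
    return os.path.dirname(unquote(urlparse(index_url).path))


def upload_html_reports(run, context, site_name="local_site"):
    """Upload the Expectations and Validation report folders to the run."""
    # Assumption: build_data_docs() returns {site name: file:// URL of index.html}.
    local_site_path = site_dir_from_url(context.build_data_docs()[site_name])
    for kind in ("expectations", "validations"):
        # Each report folder is uploaded to its own field of the run.
        run[f"gx/{kind}/html"].upload_files(os.path.join(local_site_path, kind))
    return local_site_path


# Usage with real objects (assumes your context offers convert_to_file_context()):
# context = context.convert_to_file_context()
# upload_html_reports(run, context)
```

Parsing the URL rather than slicing off a fixed prefix also handles percent-encoded characters in the path. The sketch targets POSIX-style paths; Windows drive letters would need extra handling.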