I’m happy to share that in the spirit of accessibility, we have published a Python package on PyPI. It is an SDK to access the Cumul.io Core API. Python is one of the most popular programming languages for data science. We hope that this will make it easier for teams whose code base is in Python to adopt Cumul.io as a part of their tech stack for data analysis.
The package allows you to do everything you’re used to with the Core API, such as creating Cumul.io clients, authorization tokens, datasets and columns. So if you have a data analysis pipeline written in Python, you can now feed its output to Cumul.io straight from your existing code, without having to worry about a middle layer.
To install it you will need Python >= 3.7; then simply run pip install cumulio. Here’s an example of what you might then do in your codebase:
You can create a dataset:
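A minimal sketch of what such a call might look like, assuming the client exposes a create(resource, properties) method that mirrors the Core API's create action, and that a dataset is a "securable" resource with a localized name. The FakeClient class is a hypothetical stand-in so the snippet runs without credentials; it is not the SDK itself, and the real method names may differ.

```python
import uuid


class FakeClient:
    """Hypothetical in-memory stand-in for the SDK client (not the real SDK)."""

    def __init__(self, api_key, api_token):
        self.api_key = api_key
        self.api_token = api_token
        self.store = {}  # maps (resource, id) -> stored record

    def create(self, resource, properties):
        # Assumed shape: the API echoes the properties back with a generated id.
        record = {"id": str(uuid.uuid4()), **properties}
        self.store[(resource, record["id"])] = record
        return record


client = FakeClient("<your API key>", "<your API token>")

# Assumption: datasets are created as "securable" resources of type "dataset".
dataset = client.create(
    "securable",
    {"type": "dataset", "name": {"en": "Example dataset"}},
)
print(dataset["id"], dataset["name"]["en"])
```

With the real package installed, you would swap FakeClient for the client class the SDK provides and keep the rest of the pipeline unchanged.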
Update the dataset description:
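As a rough illustration, assuming the client offers an update(resource, id, properties) method that mirrors the Core API's update action. The FakeClient class, the dataset id and the description text are all hypothetical placeholders used so the snippet runs on its own; the real SDK's signatures may differ.

```python
class FakeClient:
    """Hypothetical in-memory stand-in for the SDK client (not the real SDK)."""

    def __init__(self):
        self.store = {}  # maps (resource, id) -> stored record

    def update(self, resource, resource_id, properties):
        # Assumed behaviour: merge the new properties into the existing record.
        record = self.store[(resource, resource_id)]
        record.update(properties)
        return record


client = FakeClient()

# Pretend a dataset with this (made-up) id already exists.
dataset_id = "dataset-123"
client.store[("securable", dataset_id)] = {"id": dataset_id, "type": "dataset"}

# Assumption: descriptions are localized, keyed by language code.
updated = client.update(
    "securable",
    dataset_id,
    {"description": {"en": "Monthly sales figures"}},
)
print(updated["description"]["en"])
```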
And so on.
All of the Cumul.io SDKs for the Core API are open source projects on GitHub. Given that this one is new, anyone using it is welcome to create issues or submit recommendations to the repo. We will maintain it together, as Cumul.io and the Cumul.io community.
I will soon be publishing a full demo project using the Python SDK as a guide for anyone who wants to use it. This was also my first time publishing a package on PyPI, so for anyone interested I’ve listed below some tools and resources that I found quite helpful. I’ll also share a more detailed article explaining how I built the package. If you notice things I could have done better, or know of tools that would have been more appropriate, I’m interested, so tell me!
Poetry for dependency management, packaging and publishing
poetry shell for a virtual environment while developing
This video by Black Hills Information Security about Python package management and various options. It’s quite long but very informative and ultimately helped me decide that I would be using Poetry 😉