How AUSSDA supports the Dataverse Community

21.07.2022

We at AUSSDA are proud to be an active part of the Dataverse Community. Here is a short update on pyDataverse and new projects.

Our DevOps engineer Stefan Kasberger has been strongly involved in the development of Python tools for Dataverse. He created pyDataverse, a Python module that gives access to the Dataverse APIs, enables working with datasets and datafiles, and allows exploring various Dataverses.
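To give a feel for what "access to the Dataverse APIs" means in practice, here is a minimal sketch of the kind of HTTP request that sits underneath such a module. It is not pyDataverse's own code or interface: it uses only the Python standard library, and the helper names `api_url`, `unwrap` and `get_version` are made up for illustration. It assumes the Dataverse Native API conventions of a JSON envelope with `status` and `data` fields and an optional `X-Dataverse-key` token header.

```python
import json
from urllib.request import Request, urlopen


def api_url(base_url, path):
    """Join an installation's base URL with a Native API path."""
    return base_url.rstrip("/") + "/api/" + path.lstrip("/")


def unwrap(body):
    """Parse a Native API response body and return its 'data' part.

    Raises RuntimeError if the envelope does not report status "OK".
    """
    payload = json.loads(body)
    if payload.get("status") != "OK":
        raise RuntimeError(payload.get("message", "Dataverse API error"))
    return payload["data"]


def get_version(base_url, api_token=None, timeout=10):
    """Ask an installation which Dataverse version it runs (needs network)."""
    request = Request(api_url(base_url, "info/version"))
    if api_token:
        # Only needed for endpoints that require authentication.
        request.add_header("X-Dataverse-key", api_token)
    with urlopen(request, timeout=timeout) as response:
        return unwrap(response.read().decode("utf-8"))["version"]
```

Calling `get_version("https://demo.dataverse.org")` would query the public demo installation. A module like pyDataverse spares you from writing and maintaining helpers of this kind yourself.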

Use cases for this module, which was developed within the SSHOC project, include data migrations, automation, testing, microservices, and data science. Through new features and releases, we ensure that pyDataverse keeps evolving. The project has been moved to the Global Dataverse Community Consortium (GDCC), and Stefan is also working on an async feature prototype, which is available in a feature branch. Most importantly, we are happy that pyDataverse is a helpful tool that many members of the Dataverse community appreciate and work with: so far, more than 100,000 datasets and datafiles have been migrated using the module.

Dataverse_tests

pyDataverse also laid the groundwork for additional projects. For those who want to test the operational requirements of their Dataverse installation, there is a brand-new tool that helps with exactly that: check out dataverse_tests. Whether you want to test a fresh, customized Dataverse installation or the configuration after an upgrade, this tool was made to make these things easier for DevOps engineers and developers. It is also meant for frequent testing during operation. Besides the tests, it offers a CLI for common test-workflow steps, such as downloading data in bulk, creating a test data collection, and cleaning up after your DevOps activities.

The open-source tests are written in Python with pytest, requests and Selenium. They are well documented and easy to adapt and extend. dataverse_tests uses dataverse_testdata, a collection of high-quality metadata we created for testing purposes.
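As a rough illustration of what an operational test of this kind can look like, here is a minimal pytest-style sketch. It is not taken from dataverse_tests: the environment variable `DATAVERSE_BASE_URL` and the helper `version_problems` are hypothetical, and it uses the standard library instead of requests so that it runs without extra packages. It assumes the Native API endpoint `/api/info/version` and its JSON envelope with `status` and `data` fields.

```python
import json
import os
from urllib.request import urlopen

# Hypothetical variable name; point it at the installation under test.
BASE_URL = os.environ.get("DATAVERSE_BASE_URL", "")


def version_problems(payload):
    """Return a list of problems found in a /api/info/version payload."""
    problems = []
    if payload.get("status") != "OK":
        problems.append("status is not OK")
    data = payload.get("data")
    version = data.get("version") if isinstance(data, dict) else None
    if not version:
        problems.append("missing data.version")
    return problems


def test_installation_reports_version():
    """pytest collects this function; it needs network access to run."""
    if not BASE_URL:
        # A real suite would more likely call pytest.skip() here.
        return
    url = BASE_URL.rstrip("/") + "/api/info/version"
    with urlopen(url, timeout=10) as response:
        payload = json.load(response)
    assert version_problems(payload) == []
```

Keeping the check (`version_problems`) separate from the network call makes it easy to verify the check itself against canned responses, for example after an upgrade changes what the endpoint returns.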

The first release of dataverse_tests is out now. You are welcome to try it out and give feedback on the tool!

Dataverse Community Meeting

After 2019 and 2021, Stefan also participated in this year’s Dataverse Community Meeting. His presentation “pyDataverse: Doing Tests, Data Migrations and Other API Stuff” highlighted his latest work on the projects mentioned above. As an extra service, Stefan recorded two screencasts to make it easier for people to follow: One shows how to log in to the Dataverse frontend during tests, the other one demonstrates the data workflow. Check them out!

Screenshot of the login to Dataverse

Test Login Dataverse. Screenshot: Kasberger.

Using pyDataverse for Data Science inside a Jupyter Notebook. Screenshot: Kasberger.

Utils workflow. Screenshot: Kasberger.

SSHOC Logo

pyDataverse was funded by SSHOC.