scitacean.testing.strategies.datasets

scitacean.testing.strategies.datasets(type=None, for_upload=False, **fields)

A strategy for generating datasets.

This strategy can populate all dataset fields. However, there are some limitations:

  • Some complex model fields, e.g., lifecycle, may be left uninitialized.

  • Fields of type dict only have string values, e.g., meta will only be a dict[str, str] instead of the broader value types allowed by SciCat.

  • The dataset has no files.

Parameters:
  • type (DatasetType | None, default: None) – The type of dataset to generate. If None, a random dataset type will be chosen.

  • for_upload (bool, default: False) – If True, the dataset can be uploaded because only writable fields will be set. Otherwise, read-only fields may be set as well.

  • fields (Any) – Concrete values or specific search strategies for dataset fields.

Returns:

SearchStrategy[Dataset] – A strategy that generates Dataset instances.

Examples

To draw arbitrary datasets, use:

from hypothesis import given
from scitacean.testing import strategies as sst

@given(ds=sst.datasets())
def test_dataset(ds):
    ...  # use ds

Limit to derived datasets that can be uploaded:

@given(ds=sst.datasets(type="derived", for_upload=True))
def test_derived_upload(ds):
    client = ...
    client.upload_new_dataset_now(ds)

Fields can be fixed to specific values or generated from specific strategies. All other fields are generated as normal.

from hypothesis import strategies as st

@given(ds=sst.datasets(
    owner="librarian",
    owner_group=st.sampled_from(("library", "faculty"))
))
def test_dataset_owner(ds):
    assert ds.owner == "librarian"
    assert ds.owner_group in ("library", "faculty")
    # other tests

It is also possible to fix read-only fields:

from scitacean import PID

@given(ds=sst.datasets(pid=PID.parse("abcd-12")))
def test_dataset_fixed_pid(ds):
    assert ds.pid.prefix is None
    # other tests