|
6 | 6 |
|
7 | 7 |  |
8 | 8 |
|
9 | | -**Dataverse Rust** is a client library and command-line interface (CLI) for interacting with |
10 | | -the [Dataverse API](https://guides.dataverse.org/en/latest/api/). This project is in active development and not yet |
11 | | -feature complete. |
| 9 | +A comprehensive Rust library and command-line interface for interacting with the [Dataverse API](https://guides.dataverse.org/en/latest/api/). Build robust data repository workflows with type-safe, asynchronous operations. |
12 | 10 |
|
13 | | -## Features |
14 | | - |
15 | | -Current capabilities include: |
16 | | - |
17 | | -### Collection Management |
18 | | - |
19 | | -- **Create**: Create a new collection within the Dataverse. |
20 | | -- **Delete**: Remove an existing collection. |
21 | | -- **Publish**: Publish a collection to make it publicly available. |
22 | | -- **Contents**: Retrieve the contents of a collection. |
23 | | - |
24 | | -### General Information |
25 | | - |
26 | | -- **Version**: Retrieve the current version of the Dataverse instance. |
| 11 | +> **Note:** This project is under active development. While core functionality is stable, the API may evolve before the 1.0 release. |
27 | 12 |
|
28 | | -### Dataset Management |
| 13 | +## Why Dataverse Rust? |
29 | 14 |
|
30 | | -- **Get**: Fetch details of a specific dataset. |
31 | | -- **Create**: Create a new dataset within a collection. |
32 | | -- **Edit**: Modify an existing dataset. |
33 | | -- **Delete**: Delete an unpublished dataset. |
34 | | -- **Upload**: Upload a file to a dataset. |
35 | | -- **Publish**: Publish a dataset to make it publicly available. |
36 | | -- **Link**: Link datasets to other collections. |
| 15 | +- **🚀 High Performance** - Built with async/await using Tokio and Reqwest for efficient concurrent operations |
| 16 | +- **🔒 Type Safety** - Leverage Rust's type system to catch errors at compile time |
| 17 | +- **⚡ Direct Upload** - Parallel batch uploads for fast file transfers to S3-compatible storage |
| 18 | +- **🎯 Dual Interface** - Use as a library in your Rust projects or as a standalone CLI tool |
| 19 | +- **🔐 Secure Authentication** - Multiple auth methods including system keyring integration for credential storage |
| 20 | +- **📦 Flexible Configuration** - JSON and YAML support for all configuration files |
37 | 21 |
|
38 | | -### File Management |
| 22 | +## Features |
39 | 23 |
|
40 | | -- **Replace**: Replace existing files in a dataset. |
| 24 | +- **📚 Collections** - Create, publish, and manage Dataverse collections with hierarchical organization support |
| 25 | +- **📊 Datasets** - Full dataset lifecycle management including creation, metadata editing, versioning, publishing, linking, and deletion, with support for dataset locks and review workflows
| 26 | +- **📁 Files** - Upload files via standard or direct upload (with parallel batch support), replace existing files, download files and complete datasets, and manage file metadata |
| 27 | +- **🔍 Search** - Query datasets and files across your Dataverse instance with flexible search parameters |
| 28 | +- **🛠️ Administration** - Manage storage drivers, configure external tools, and perform administrative operations |
| 29 | +- **ℹ️ Instance Information** - Retrieve version information and available metadata exporters from your Dataverse instance |
41 | 30 |
|
42 | 31 | ## Installation |
43 | 32 |
|
44 | | -**Command line** |
| 33 | +### CLI Installation |
| 34 | + |
| 35 | +Install the command-line tool directly from the repository: |
45 | 36 |
|
46 | 37 | ```bash |
47 | 38 | cargo install --git https://github.com/JR-1991/rust-dataverse.git |
48 | 39 | ``` |
49 | 40 |
|
50 | | -**Cargo.toml** |
| 41 | +### Library Installation |
51 | 42 |
|
52 | | -Please note, this crate is not yet published on crates.io. You can add it to your `Cargo.toml` file by pointing to the |
53 | | -GitHub repository. |
| 43 | +Add to your `Cargo.toml`: |
54 | 44 |
|
55 | 45 | ```toml |
56 | 46 | [dependencies] |
57 | 47 | dataverse = { git = "https://github.com/JR-1991/rust-dataverse" } |
58 | 48 | ``` |
59 | 49 |
|
| 50 | +> **Note:** This crate is not yet published on crates.io. Pre-1.0 releases will be available soon.
| 51 | +
|
60 | 52 | ## Usage |
61 | 53 |
|
62 | | -### Command line |
| 54 | +### Library Usage |
| 55 | + |
| 56 | +The library provides an async API built on `tokio` and `reqwest`. Import the prelude for common types: |
| 57 | + |
| 58 | +```rust |
| 59 | +use dataverse::prelude::*; |
| 60 | + |
| 61 | +#[tokio::main] |
| 62 | +async fn main() -> Result<(), Box<dyn std::error::Error>> { |
| 63 | + // Initialize client |
| 64 | + let client = BaseClient::new( |
| 65 | + "https://demo.dataverse.org", |
| 66 | + Some("your-api-token") |
| 67 | + )?; |
| 68 | + |
| 69 | + // Get instance version |
| 70 | + let version = info::get_version(&client).await?; |
| 71 | + println!("Dataverse version: {}", version.data.unwrap()); |
| 72 | + |
| 73 | + // Create a dataset |
| 74 | + let dataset_body = dataset::create::DatasetCreateBody { |
| 75 | + // ... configure metadata |
| 76 | + ..Default::default() |
| 77 | + }; |
| 78 | + let response = dataset::create_dataset(&client, "root", dataset_body).await?; |
| 79 | + |
| 80 | + // Upload a file |
| 81 | + let file = UploadFile::from("path/to/file.csv"); |
| 82 | + let identifier = Identifier::PersistentId("doi:10.5072/FK2/ABCDEF".to_string()); |
| 83 | + dataset::upload_file_to_dataset(&client, identifier, file, None, None).await?; |
| 84 | + |
| 85 | + Ok(()) |
| 86 | +} |
| 87 | +``` |
63 | 88 |
|
64 | | -Before you can use the command line tool, you need to set the `DVCLI_URL` and `DVCLI_TOKEN` environment variables. You |
65 | | -can do this by adding the following lines to your `.bashrc` or `.bash_profile` file: |
| 89 | +**Key Library Modules:** |
66 | 90 |
|
67 | | -```bash |
68 | | -export DVCLI_URL="https://your.dataverse.url" |
69 | | -export DVCLI_TOKEN="your_token_here" |
70 | | -``` |
| 91 | +- `dataverse::client::BaseClient` - HTTP client for API interactions |
| 92 | +- `dataverse::native_api::collection` - Collection operations |
| 93 | +- `dataverse::native_api::dataset` - Dataset operations |
| 94 | +- `dataverse::native_api::file` - File operations |
| 95 | +- `dataverse::native_api::admin` - Administrative operations |
| 96 | +- `dataverse::search_api` - Search functionality |
| 97 | +- `dataverse::direct_upload` - Direct upload with parallel batch support |
| 98 | +- `dataverse::data_access` - File and dataset downloads |
| 99 | + |
| 100 | +### CLI Usage |
| 101 | + |
| 102 | +The CLI provides three flexible authentication methods: |
71 | 103 |
|
72 | | -The command line tool in organized in subcommands. To see a list of available subcommands, run: |
| 104 | +#### 1. Profile-Based (Recommended) |
| 105 | + |
| 106 | +Store credentials securely in your system keyring: |
73 | 107 |
|
74 | 108 | ```bash |
75 | | -dvcli --help |
| 109 | +# Create a profile |
| 110 | +dvcli auth set --name production --url https://dataverse.org --token your-api-token |
| 111 | + |
| 112 | +# Use the profile |
| 113 | +dvcli --profile production info version |
76 | 114 | ``` |
77 | 115 |
|
78 | | -To see help for a specific subcommand, run: |
| 116 | +#### 2. Environment Variables |
| 117 | + |
| 118 | +Set environment variables for automatic authentication: |
79 | 119 |
|
80 | 120 | ```bash |
81 | | -dvcli <subcommand> --help |
| 121 | +export DVCLI_URL="https://demo.dataverse.org" |
| 122 | +export DVCLI_TOKEN="your-api-token" |
| 123 | + |
| 124 | +dvcli dataset meta doi:10.5072/FK2/ABC123 |
82 | 125 | ``` |
83 | 126 |
|
84 | | -**Example** |
| 127 | +#### 3. Interactive Mode |
85 | 128 |
|
86 | | -In this examples we will demonstrate how to retrieve the version of the Dataverse instance. |
| 129 | +If neither a profile nor the environment variables are set, the CLI prompts for credentials:
87 | 130 |
|
88 | 131 | ```bash |
89 | 132 | dvcli info version |
| 133 | +# Prompts for URL and token |
90 | 134 | ``` |
91 | 135 |
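The three authentication methods above imply a fallback order. The following is a minimal sketch of that decision, not the actual `dvcli` implementation. It assumes that an explicit profile takes precedence over environment variables and that both `DVCLI_URL` and `DVCLI_TOKEN` must be non-empty for the environment method to apply; the README only states that prompting is the last resort. `CredentialSource` and `resolve_source` are hypothetical names.

```rust
use std::collections::HashMap;

/// Which authentication method supplies the URL and token.
#[derive(Debug, PartialEq)]
enum CredentialSource {
    /// A named profile stored in the system keyring.
    Profile(String),
    /// The DVCLI_URL and DVCLI_TOKEN environment variables.
    Environment,
    /// Nothing configured: prompt the user.
    Interactive,
}

/// Pick a credential source (hypothetical helper).
/// Assumption: profile > environment > interactive prompt.
fn resolve_source(profile: Option<&str>, env: &HashMap<String, String>) -> CredentialSource {
    if let Some(name) = profile {
        return CredentialSource::Profile(name.to_string());
    }
    // Assumption: both variables must be present and non-empty.
    let is_set = |key: &str| env.get(key).map_or(false, |value| !value.is_empty());
    if is_set("DVCLI_URL") && is_set("DVCLI_TOKEN") {
        CredentialSource::Environment
    } else {
        CredentialSource::Interactive
    }
}
```

Passing the environment as a map rather than reading `std::env::vars()` inside the function keeps the decision easy to test; a caller could build the map with `std::env::vars().collect()`.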
|
92 | | -The output will be similar to: |
| 136 | +**Common CLI Operations:** |
| 137 | + |
| 138 | +> **Note:** Configuration files can be provided in either JSON or YAML format.
93 | 139 |
|
94 | 140 | ```bash |
95 | | -Calling: http://localhost:8080/api/info/version |
96 | | -└── 🎉 Success! - Received the following response: |
| 141 | +# Get help |
| 142 | +dvcli --help |
| 143 | +dvcli dataset --help |
97 | 144 |
|
98 | | -{ |
99 | | - "version": "6.2" |
100 | | -} |
| 145 | +# Collections |
| 146 | +dvcli collection create --parent root --body collection.json |
| 147 | +dvcli collection publish my-collection |
| 148 | + |
| 149 | +# Datasets |
| 150 | +dvcli dataset create --collection root --body dataset.json # or dataset.yaml |
| 151 | +dvcli dataset upload --id doi:10.5072/FK2/ABC123 data.csv |
| 152 | +dvcli dataset publish doi:10.5072/FK2/ABC123 |
| 153 | + |
| 154 | +# Direct upload (faster for large files) |
| 155 | +dvcli dataset direct-upload --id doi:10.5072/FK2/ABC123 --parallel 5 file1.csv file2.csv |
| 156 | + |
| 157 | +# Files |
| 158 | +dvcli file replace --id 12345 --path new-file.csv |
| 159 | +dvcli file download file-pid.txt --path ./downloads/ |
| 160 | + |
| 161 | +# Search |
| 162 | +dvcli search -q "climate change" -t dataset -t file |
| 163 | + |
| 164 | +# Admin |
| 165 | +dvcli admin storage-drivers |
| 166 | +dvcli admin add-external-tool tool-manifest.json |
101 | 167 | ``` |
102 | 168 |
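Because body files such as `dataset.json` or `dataset.yaml` can be in either format, a tool has to decide how to parse each one. The sketch below shows one plausible approach, selecting the format by file extension. This is an assumption for illustration: the README does not say how `dvcli` distinguishes the two, and `ConfigFormat` and `detect_format` are hypothetical names.

```rust
use std::path::Path;

/// The two configuration formats the README says are accepted.
#[derive(Debug, PartialEq)]
enum ConfigFormat {
    Json,
    Yaml,
}

/// Guess the format from the file extension, case-insensitively.
/// Hypothetical helper; returns None for unknown or missing extensions.
fn detect_format(path: &str) -> Option<ConfigFormat> {
    let ext = Path::new(path).extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "json" => Some(ConfigFormat::Json),
        "yaml" | "yml" => Some(ConfigFormat::Yaml),
        _ => None,
    }
}
```

Returning `None` instead of defaulting to one format lets the caller report an unsupported file explicitly.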
|
103 | 169 | ## Examples |
104 | 170 |
|
105 | | -We have provided an example in the `examples` directory. These examples demonstrate how to use the client to perform |
106 | | -various operations. |
| 171 | +Complete workflow examples are available in the [`examples/`](examples/) directory: |
| 172 | + |
| 173 | +- **[create-upload-publish](examples/create-upload-publish)** - End-to-end workflow demonstrating collection and dataset creation, file upload, and publishing using shell scripts and the CLI. |
| 174 | + |
| 175 | +Beyond this example, the [Dataverse Recipes](https://github.com/gdcc/dataverse-recipes/tree/main/dvcli) repository provides recipes that cover most of the CLI's functionality.
| 176 | + |
| 177 | +## Development |
| 178 | + |
| 179 | +### Running Tests |
| 180 | + |
| 181 | +Tests require a running Dataverse instance. We provide a convenient test script that handles infrastructure setup: |
| 182 | + |
| 183 | +```bash |
| 184 | +# Run all tests (starts Docker containers automatically) |
| 185 | +./run-tests.sh |
| 186 | + |
| 187 | +# Run a specific test |
| 188 | +./run-tests.sh test_create_dataset |
| 189 | +``` |
| 190 | + |
| 191 | +The script automatically: |
| 192 | + |
| 193 | +- Starts Dataverse with PostgreSQL and Solr via Docker Compose |
| 194 | +- Waits for services to be ready |
| 195 | +- Configures environment variables |
| 196 | +- Executes the test suite |
| 197 | + |
| 198 | +Docker containers remain running after tests complete for faster subsequent runs. View logs with `docker logs dataverse` if you encounter issues. |
| 199 | + |
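The "waits for services to be ready" step is a polling loop at heart. The sketch below shows the general pattern in Rust for anyone writing their own test harness. It is not taken from `run-tests.sh`, and `wait_until_ready` is a hypothetical helper.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Call `check` until it returns true or `max_attempts` is exhausted.
/// Returns the number of attempts used on success, or None on timeout.
fn wait_until_ready<F>(mut check: F, max_attempts: u32, delay: Duration) -> Option<u32>
where
    F: FnMut() -> bool,
{
    for attempt in 1..=max_attempts {
        if check() {
            return Some(attempt);
        }
        // Do not sleep after the final failed attempt.
        if attempt < max_attempts {
            sleep(delay);
        }
    }
    None
}
```

A readiness check could be as simple as `|| std::net::TcpStream::connect("localhost:8080").is_ok()`, assuming Dataverse listens on port 8080 as in the manual setup. An open port does not guarantee the application has finished booting, so a stricter check might request the version endpoint instead.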
| 200 | +### Manual Test Setup |
| 201 | + |
| 202 | +For granular control during development: |
| 203 | + |
| 204 | +```bash |
| 205 | +# Start infrastructure |
| 206 | +docker compose -f ./docker/docker-compose-base.yml --env-file local-test.env up -d |
| 207 | + |
| 208 | +# Configure environment |
| 209 | +export BASE_URL=http://localhost:8080 |
| 210 | +export DV_VERSION=6.2 |
| 211 | +export $(grep "API_TOKEN" "dv/bootstrap.exposed.env") |
| 212 | +export API_TOKEN_SUPERUSER=$API_TOKEN |
| 213 | + |
| 214 | +# Run tests |
| 215 | +cargo test |
| 216 | +cargo test -- --nocapture # with output |
| 217 | +cargo test test_name # specific test |
| 218 | +cargo test collection:: # module tests |
| 219 | +``` |
| 220 | + |
| 221 | +## Contributing |
| 222 | + |
| 223 | +Contributions are welcome! Whether you're fixing bugs, adding features, or improving documentation, your help is appreciated. Please feel free to open issues or submit pull requests on [GitHub](https://github.com/JR-1991/rust-dataverse). |
| 224 | + |
| 225 | +## Community |
107 | 226 |
|
108 | | -* [`examples/create-upload-publish`](examples/create-upload-publish) - Demonstrates how to create a collection, dataset, |
109 | | - upload a file, and publish the collection and dataset. |
| 227 | +Join the conversation on the [Dataverse Zulip Channel](https://dataverse.zulipchat.com)! Connect with other developers, get help, share ideas, and discuss the future of Rust clients for Dataverse. |
110 | 228 |
|
111 | | -## ToDo's |
| 229 | +## License |
112 | 230 |
|
113 | | -- [ ] Implement remaining API endpoints |
114 | | -- [x] Write unit and integration tests |
115 | | -- [x] Asynchronous support using `tokio` |
116 | | -- [x] Documentation |
117 | | -- [ ] Publish on crates.io |
118 | | -- [x] Continuous integration |
119 | | -- [ ] Validate before upload using `/api/dataverses/$ID/validateDatasetJson` |
| 231 | +This project is licensed under the MIT License - see the [License.md](License.md) file for details. |