Commit 2c44159

Merge pull request #11 from gdcc/test-guidelines
Add test script and docker setup
2 parents 5ba4417 + 2796175 commit 2c44159

File tree

12 files changed: +666 −118 lines changed


.gitignore

Lines changed: 3 additions & 0 deletions

```diff
@@ -3,3 +3,6 @@ src/bin/dvtest.rs
 .idea/*
 .DS_Store
 */**/.DS_Store
+
+dv/
+solr/
```

Readme.md

Lines changed: 174 additions & 62 deletions

![Build Status](https://github.com/JR-1991/rust-dataverse/actions/workflows/tests.yml/badge.svg)

A comprehensive Rust library and command-line interface for interacting with the [Dataverse API](https://guides.dataverse.org/en/latest/api/). Build robust data repository workflows with type-safe, asynchronous operations.

> **Note:** This project is under active development. While core functionality is stable, the API may evolve before the 1.0 release.

## Why Dataverse Rust?

- **🚀 High Performance** - Built with async/await using Tokio and Reqwest for efficient concurrent operations
- **🔒 Type Safety** - Leverage Rust's type system to catch errors at compile time
- **⚡ Direct Upload** - Parallel batch uploads for fast file transfers to S3-compatible storage
- **🎯 Dual Interface** - Use as a library in your Rust projects or as a standalone CLI tool
- **🔐 Secure Authentication** - Multiple auth methods including system keyring integration for credential storage
- **📦 Flexible Configuration** - JSON and YAML support for all configuration files

## Features

- **📚 Collections** - Create, publish, and manage Dataverse collections with hierarchical organization support
- **📊 Datasets** - Full dataset lifecycle management including creation, metadata editing, versioning, publishing, linking, and deletion. Support for dataset locks and review workflows
- **📁 Files** - Upload files via standard or direct upload (with parallel batch support), replace existing files, download files and complete datasets, and manage file metadata
- **🔍 Search** - Query datasets and files across your Dataverse instance with flexible search parameters
- **🛠️ Administration** - Manage storage drivers, configure external tools, and perform administrative operations
- **ℹ️ Instance Information** - Retrieve version information and available metadata exporters from your Dataverse instance

## Installation

### CLI Installation

Install the command-line tool directly from the repository:

```bash
cargo install --git https://github.com/JR-1991/rust-dataverse.git
```

### Library Installation

Add to your `Cargo.toml`:

```toml
[dependencies]
dataverse = { git = "https://github.com/JR-1991/rust-dataverse" }
```

> **Note:** Not yet published on crates.io. Pre-1.0 releases will be available soon.

## Usage

### Library Usage

The library provides an async API built on `tokio` and `reqwest`. Import the prelude for common types:

```rust
use dataverse::prelude::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize client
    let client = BaseClient::new(
        "https://demo.dataverse.org",
        Some("your-api-token"),
    )?;

    // Get instance version
    let version = info::get_version(&client).await?;
    println!("Dataverse version: {}", version.data.unwrap());

    // Create a dataset
    let dataset_body = dataset::create::DatasetCreateBody {
        // ... configure metadata
        ..Default::default()
    };
    let response = dataset::create_dataset(&client, "root", dataset_body).await?;

    // Upload a file
    let file = UploadFile::from("path/to/file.csv");
    let identifier = Identifier::PersistentId("doi:10.5072/FK2/ABCDEF".to_string());
    dataset::upload_file_to_dataset(&client, identifier, file, None, None).await?;

    Ok(())
}
```

**Key Library Modules:**

- `dataverse::client::BaseClient` - HTTP client for API interactions
- `dataverse::native_api::collection` - Collection operations
- `dataverse::native_api::dataset` - Dataset operations
- `dataverse::native_api::file` - File operations
- `dataverse::native_api::admin` - Administrative operations
- `dataverse::search_api` - Search functionality
- `dataverse::direct_upload` - Direct upload with parallel batch support
- `dataverse::data_access` - File and dataset downloads

### CLI Usage

The CLI provides three flexible authentication methods:

#### 1. Profile-Based (Recommended)

Store credentials securely in your system keyring:

```bash
# Create a profile
dvcli auth set --name production --url https://dataverse.org --token your-api-token

# Use the profile
dvcli --profile production info version
```

#### 2. Environment Variables

Set environment variables for automatic authentication:

```bash
export DVCLI_URL="https://demo.dataverse.org"
export DVCLI_TOKEN="your-api-token"

dvcli dataset meta doi:10.5072/FK2/ABC123
```

#### 3. Interactive Mode

If neither a profile nor environment variables are set, the CLI prompts for credentials:

```bash
dvcli info version
# Prompts for URL and token
```

**Common CLI Operations:**

> **Note:** Configuration files can be provided in both JSON and YAML formats.

```bash
# Get help
dvcli --help
dvcli dataset --help

# Collections
dvcli collection create --parent root --body collection.json
dvcli collection publish my-collection

# Datasets
dvcli dataset create --collection root --body dataset.json  # or dataset.yaml
dvcli dataset upload --id doi:10.5072/FK2/ABC123 data.csv
dvcli dataset publish doi:10.5072/FK2/ABC123

# Direct upload (faster for large files)
dvcli dataset direct-upload --id doi:10.5072/FK2/ABC123 --parallel 5 file1.csv file2.csv

# Files
dvcli file replace --id 12345 --path new-file.csv
dvcli file download file-pid.txt --path ./downloads/

# Search
dvcli search -q "climate change" -t dataset -t file

# Admin
dvcli admin storage-drivers
dvcli admin add-external-tool tool-manifest.json
```
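
The `--parallel` flag above batches file transfers concurrently. As a rough illustration of the idea only (not dvcli's actual implementation), shell job control can fan out independent transfers and wait for all of them; `upload` here is a hypothetical stand-in for a single transfer:

```shell
# Illustration only: fan out independent transfers, then wait for all of them.
# `upload` is a hypothetical stand-in for one file transfer.
upload() {
  sleep 0.1              # simulate network latency
  echo "uploaded $1"
}

for f in file1.csv file2.csv file3.csv; do
  upload "$f" &          # run each transfer as a background job
done
wait                     # block until every background job has finished
```

The completion order of the jobs is not deterministic, which is why tools that do this for real collect results per file rather than relying on output order.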

## Examples

Complete workflow examples are available in the [`examples/`](examples/) directory:

- **[create-upload-publish](examples/create-upload-publish)** - End-to-end workflow demonstrating collection and dataset creation, file upload, and publishing using shell scripts and the CLI.

Beyond these examples, recipes covering most of the CLI's functionality are available in the [Dataverse Recipes](https://github.com/gdcc/dataverse-recipes/tree/main/dvcli) repository.

## Development

### Running Tests

Tests require a running Dataverse instance. We provide a convenient test script that handles infrastructure setup:

```bash
# Run all tests (starts Docker containers automatically)
./run-tests.sh

# Run a specific test
./run-tests.sh test_create_dataset
```

The script automatically:

- Starts Dataverse with PostgreSQL and Solr via Docker Compose
- Waits for services to be ready
- Configures environment variables
- Executes the test suite

Docker containers remain running after tests complete for faster subsequent runs. View logs with `docker logs dataverse` if you encounter issues.
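
The "waits for services to be ready" step can be approximated by polling an endpoint until it answers. A minimal sketch, assuming the script does something along these lines (`wait_for` and the attempt budget are illustrative, not taken from `run-tests.sh`):

```shell
# Illustrative readiness polling; not necessarily how run-tests.sh does it.
wait_for() {
  # Usage: wait_for <attempts> <command...>
  # Retries the command once per second until it succeeds or attempts run out.
  local attempts=$1; shift
  local i
  for ((i = 1; i <= attempts; i++)); do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep 1
  done
  return 1
}

# e.g. poll the version endpoint until Dataverse answers:
# wait_for 60 curl -sf http://localhost:8080/api/info/version
```

Polling a cheap read-only endpoint like `/api/info/version` avoids starting the test suite against a container that is up but not yet serving the API.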
### Manual Test Setup

For granular control during development:

```bash
# Start infrastructure
docker compose -f ./docker/docker-compose-base.yml --env-file local-test.env up -d

# Configure environment
export BASE_URL=http://localhost:8080
export DV_VERSION=6.2
export $(grep "API_TOKEN" "dv/bootstrap.exposed.env")
export API_TOKEN_SUPERUSER=$API_TOKEN

# Run tests
cargo test
cargo test -- --nocapture   # with output
cargo test test_name        # specific test
cargo test collection::     # module tests
```
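
The `export $(grep ...)` line above works because the bootstrap env file stores `KEY=VALUE` pairs: `grep` extracts the matching line and `export` turns it into an environment variable. A self-contained demonstration against a throwaway file (the token value is made up):

```shell
# Demonstrates the grep-based export used above, against a temporary env file.
tmp=$(mktemp)
printf 'API_TOKEN=abc123\n' > "$tmp"   # made-up token for illustration

export $(grep "API_TOKEN" "$tmp")      # same pattern as the setup step
echo "$API_TOKEN"                      # prints: abc123

rm -f "$tmp"
```

Note that this simple pattern assumes values without spaces or quoting; for richer env files, a dedicated loader is safer.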
## Contributing

Contributions are welcome! Whether you're fixing bugs, adding features, or improving documentation, your help is appreciated. Please feel free to open issues or submit pull requests on [GitHub](https://github.com/JR-1991/rust-dataverse).

## Community

Join the conversation on the [Dataverse Zulip Channel](https://dataverse.zulipchat.com)! Connect with other developers, get help, share ideas, and discuss the future of Rust clients for Dataverse.

## License

This project is licensed under the MIT License - see the [License.md](License.md) file for details.

conf/localstack/init-s3.sh

Lines changed: 7 additions & 0 deletions

```bash
#!/bin/bash

# Create the mybucket bucket for Dataverse S3 storage
awslocal s3 mb s3://mybucket

echo "S3 bucket 'mybucket' created successfully"
```

0 commit comments