Remove LSID column from provisioned sample tables #7235

Merged

labkey-nicka merged 69 commits into develop from fb_remove_sample_lsid

Dec 13, 2025

Contributor

labkey-nicka commented Dec 2, 2025

Rationale

For the last few years, sample types have had two code pathways for updating samples: row-by-row and data iterator. These two pathways are expected to overlap completely in terms of functional input/output; however, we often see issues where something is supported (or broken) in one and not the other. Maintaining both pathways is difficult, makes it easy to introduce bugs when both are not considered, and overall gives us less confidence in the sample update process.

With these changes, the row-by-row update pathway has been removed entirely. The data iterator pathway is now the only way samples are updated. This should improve stability and reduce complexity. One challenge until now has been reconciling these pathways, because the data iterator requires that all rows specify the same columns, whereas row-by-row does not. To achieve functional parity in the data iterator, we now partition the rows by the columns specified before iterating. Each partition is then processed, in order, until all rows are processed.
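As a rough illustration of the partitioning step, here is a minimal sketch that groups consecutive rows sharing the same case-insensitive set of column names, so each group can be handed to a data iterator as one uniform batch. It is modeled on the loop quoted in the review of SampleTypeUpdateServiceDI.java later in this thread, but the class and method names (`RowPartitioner`, `shapeOf`, `partition`) are hypothetical, and a plain `TreeSet` with `String.CASE_INSENSITIVE_ORDER` stands in for LabKey's `CaseInsensitiveHashSet`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not the class used in the PR.
class RowPartitioner
{
    /** Case-insensitive set of a row's column names (its "shape"). */
    static Set<String> shapeOf(Map<String, Object> row)
    {
        Set<String> shape = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        shape.addAll(row.keySet());
        return shape;
    }

    /** Splits rows into runs of consecutive rows with the same shape, preserving input order. */
    static List<List<Map<String, Object>>> partition(List<Map<String, Object>> rows)
    {
        List<List<Map<String, Object>>> partitions = new ArrayList<>();
        int index = 0;
        while (index < rows.size())
        {
            Set<String> rowKeys = shapeOf(rows.get(index));
            int nextIndex = index + 1;
            // Extend the current run while the following rows specify the same columns.
            while (nextIndex < rows.size() && rowKeys.equals(shapeOf(rows.get(nextIndex))))
                nextIndex++;
            partitions.add(new ArrayList<>(rows.subList(index, nextIndex)));
            index = nextIndex;
        }
        return partitions;
    }
}
```

Because only consecutive rows are grouped, two rows with the same shape that are separated by a differently shaped row land in different partitions; this keeps processing order identical to input order at the cost of possibly more batches.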

Additionally, this removes the duplicate LSID column from provisioned sample type tables and updates the data iterator pathway for samples to support being keyed by (RowId) or (Name, MaterialSourceId). Formerly, specifying (LSID) was the only way to invoke the data iterator pathway. With these changes, specifying LSID as a key for sample update is no longer supported.

Sample Update Functional Changes:

LSID is no longer supported as a key for updating samples.
Specifying only LSID, without RowId or Name, will result in an error.
RowId is not accepted when merging samples. The sample name can be specified instead.
- If a user encounters this and is unable to work around it, a new experimental flag lets them opt out of this behavior and accept the RowId. The RowId column values will be ignored.
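
To make the update key rules above concrete, here is a minimal sketch of how a single update row might be classified. It is an assumption-laden illustration, not the PR's implementation: `SampleUpdateKeys`, `KeyKind`, and `resolve` are hypothetical names, the MaterialSourceId is assumed to come from the target sample type rather than the row, and merge handling and the experimental flag are not modeled.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of the key rules described above.
class SampleUpdateKeys
{
    enum KeyKind { ROW_ID, NAME_AND_SAMPLE_TYPE }

    /** Chooses the key for an update row; LSID alone is rejected. */
    static KeyKind resolve(Map<String, Object> row)
    {
        // Column names are matched case-insensitively.
        Map<String, Object> columns = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        columns.putAll(row);

        // RowId wins when present, so Name can then carry a rename.
        if (columns.get("RowId") != null)
            return KeyKind.ROW_ID;

        // Otherwise Name identifies the sample within the target sample type
        // (assumption: MaterialSourceId is supplied by the table, not the row).
        if (columns.get("Name") != null)
            return KeyKind.NAME_AND_SAMPLE_TYPE;

        throw new IllegalArgumentException(
            "Either RowId or Name is required to update a sample; LSID is no longer supported as a key.");
    }
}
```

Preferring RowId over Name when both are present matches the listed change that samples can be renamed when rowId is specified.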

Related Pull Requests

Changes

Implement new data iterators to perform the same checks as the row-by-row pathway.
Consolidate logic for resolving parameters for ExistingRecordDataIterator via new experiment table interfaces getExistingRecordKeyColumnNames() and getExistingRecordSharedKeyColumnNames().
Support updating experiment object properties (a.k.a. vocabulary properties) on samples / data via data iterator.
Add an upgrade script to drop the LSID column from provisioned sample type tables.
Join the provisioned table by rowId instead of LSID.
Loop over rows with the same column shape and process them in batches via data iterator.
Support renaming samples when rowId is specified via data iterator and via file import/update.
Remove row-by-row pathway.

labkey-nicka requested review from XingY and labkey-matthewb

December 2, 2025 14:43

labkey-nicka self-assigned this

This was referenced Dec 2, 2025

Remove LSID column from provisioned sample tables LabKey/labkey-ui-components#1901

Merged

Remove LSID column from provisioned sample tables LabKey/testAutomation#2798

Merged

labkey-matthewb reviewed

api/src/org/labkey/api/dataiterator/SampleUpdateAddColumnsDataIterator.java (resolved)

labkey-matthewb reviewed

api/src/org/labkey/api/dataiterator/TriggerDataBuilderHelper.java (resolved)

labkey-matthewb reviewed

experiment/src/org/labkey/experiment/ExpDataIterators.java

    
      if (o instanceof String s)
      {
-         aliquotParentName = StringUtilsLabKey.unquoteString((String) o);
+         aliquotParentName = StringUtilsLabKey.unquoteString(s);

Contributor

labkey-matthewb Dec 3, 2025

I know this is existing code. Just pointing out that unquoteString() looks very out of place here???

Contributor

XingY Dec 3, 2025

Should be able to use ExperimentService.getParentValues here instead.

Contributor Author

labkey-nicka Dec 5, 2025

I plan on leaving as-is for this PR.

labkey-matthewb reviewed

experiment/src/org/labkey/experiment/ExpDataIterators.java

    
  List<UploadSampleRunRecord> runRecords = new ArrayList<>();
- Set<String> keys = new LinkedHashSet<>();
+ Set<Object> keys = new LinkedHashSet<>();

Contributor

labkey-matthewb Dec 3, 2025

For type validation, should we use LongHashMap/CaseInsensitiveHashMap as appropriate and then cast to Set of Object? It's nice to move toward (runtime) type validating collections where possible.

Contributor Author

labkey-nicka Dec 5, 2025

Agreed, validation is good. However, I'm going to leave as-is for this PR.
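
For context on the suggestion above, here is a minimal sketch of a runtime type-validating key set exposed as `Set<Object>`. It swaps in the JDK's `Collections.checkedSet` rather than the LabKey collections named in the comment, and `CheckedKeys` / `typedKeySet` are hypothetical names used only for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper; uses java.util.Collections.checkedSet, not LongHashMap/CaseInsensitiveHashMap.
class CheckedKeys
{
    /**
     * Returns an insertion-ordered set typed as Set<Object> that still rejects,
     * at runtime, any element that is not an instance of keyType.
     */
    @SuppressWarnings("unchecked")
    static <E> Set<Object> typedKeySet(Class<E> keyType)
    {
        Set<E> checked = Collections.checkedSet(new LinkedHashSet<E>(), keyType);
        // The static type is widened, but the checked wrapper keeps validating on add().
        return (Set<Object>) (Set<?>) checked;
    }
}
```

With `Set<Object> keys = CheckedKeys.typedKeySet(Long.class)`, adding a `Long` succeeds while adding a `String` throws `ClassCastException`, so a mix of RowId and Name keys in one set would fail fast instead of silently never matching.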

labkey-matthewb reviewed

experiment/src/org/labkey/experiment/ExpDataIterators.java (resolved)

labkey-matthewb reviewed

experiment/src/org/labkey/experiment/ExpDataIterators.java

    
    var map = DataIteratorUtil.createColumnNameMap(in);
    if (map.containsKey(RootMaterialRowId.toString()) || !map.containsKey(RowId.toString()))
        return in;
    var ret = new SimpleTranslator(in, ctx);

Contributor

labkey-matthewb Dec 3, 2025

Nothing wrong with this, but note to self: we should have a simple pattern for this that starts with WrapperDataIterator.

labkey-matthewb reviewed

experiment/src/org/labkey/experiment/ExpDataIterators.java (resolved)

labkey-nicka force-pushed the fb_remove_sample_lsid branch from 84442f1 to 957aeb4 Compare

December 3, 2025 23:38

XingY reviewed

experiment/src/org/labkey/experiment/api/SampleTypeUpdateServiceDI.java (outdated, resolved)

experiment/src/org/labkey/experiment/api/SampleTypeUpdateServiceDI.java (outdated, resolved)

experiment/src/org/labkey/experiment/api/ExpMaterialTableImpl.java (resolved)

experiment/src/org/labkey/experiment/ExpDataIterators.java (outdated, resolved)

experiment/src/org/labkey/experiment/api/SampleTypeUpdateServiceDI.java (outdated, resolved)

labkey-nicka force-pushed the fb_remove_sample_lsid branch 2 times, most recently from f4d9bc6 to 7398e73 Compare

December 5, 2025 22:14

labkey-nicka added 14 commits

December 8, 2025 19:40


          Initial removal

7ecf40c


          More removal

d25ccfd


          Various updates

8bbd758


          getObjectPropertiesSelector

7d2b57b

nit

59354cb


          Initial ExpDataIterators refactor

32077be


          Restore merge/update dynamic

f363b2c


          Remove LSID support allowUpdate

e14d648


          Merge v Update Round 12373

d72b527


          Resolve keys

363ed9b


          Use getRows

62043e1


          Bump @LabKey packages

7167f3c


          ExistingRecordDataIterator: check for key columns rather than requiring all specified

- always include table key columns

ac7ba3a


          comment

7ccfa37

labkey-nicka added 4 commits

December 9, 2025 14:49


          nits

89fc756


          Support cross-folder import rowId

4c2f4e0


          Merge branch 'develop' into fb_remove_sample_lsid

f67b9a8


          Bump @LabKey packages

b604358

XingY reviewed

experiment/src/org/labkey/experiment/ExpDataIterators.java (outdated, resolved)

experiment/src/org/labkey/experiment/api/SampleTypeUpdateServiceDI.java (resolved)

XingY reviewed

experiment/src/client/test/integration/SampleTypeCrud.ispec.ts (resolved)

labkey-nicka added 9 commits

December 9, 2025 22:08


          Bump @LabKey packages


          Convert key type

7c1354e


          Do not update MaterialSourceId

db05b44

fix

01a0376


          Merge branch 'develop' into fb_remove_sample_lsid

46fe825


          Merge branch 'develop' into fb_remove_sample_lsid

160471c


          Bump @LabKey packages

ac820de


          Use query table only

92332d9


          Allow RowId during merge

4c41fa1

labkey-nicka requested a review from XingY

December 11, 2025 18:03

XingY reviewed

experiment/src/org/labkey/experiment/api/SampleTypeUpdateServiceDI.java (resolved)

experiment/src/org/labkey/experiment/api/SampleTypeUpdateServiceDI.java (resolved)

labkey-matthewb reviewed

experiment/src/org/labkey/experiment/api/SampleTypeUpdateServiceDI.java

    
        );
    }, DbScope.CommitTaskOption.POSTCOMMIT);
            var nextIndex = index + 1;
            while (nextIndex < rows.size() && rowKeys.equals(new CaseInsensitiveHashSet(rows.get(nextIndex).keySet())))

Contributor

labkey-matthewb Dec 11, 2025

Side-note: If the rows happen to be ArrayListMap then the keySet() objects should be the same (==) and this will be fast. I don't know in which code paths that would be the case.

Contributor

labkey-matthewb Dec 12, 2025

Oops, only the test version of ArrayListMap overrides keySet(), the server implementation should too!

labkey-matthewb approved these changes

labkey-nicka added 3 commits

December 11, 2025 16:46


          ExpMaterialImpl: set MaterialSourceId on insert

2f3da94


          Summary audit event, DataIteratorPartitions

ae75e1a


          Merge branch 'develop' into fb_remove_sample_lsid

243c9fb

XingY approved these changes

labkey-nicka added 4 commits

December 12, 2025 14:42


          Experimental feature, check for duplicates


          Merge branch 'develop' into fb_remove_sample_lsid

2c3d90a


          Merge branch 'develop' into fb_remove_sample_lsid

42c8bc8


          Bump @LabKey packages

b43d547

labkey-nicka merged commit f610a31 into develop

7 of 14 checks passed

labkey-nicka deleted the fb_remove_sample_lsid branch

December 13, 2025 19:03

labkey-nicka mentioned this pull request

Samples: query for each row to retain order #7269

Merged

