-
Notifications
You must be signed in to change notification settings - Fork 7
Remove LSID column from provisioned sample tables #7235
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
api/src/org/labkey/api/dataiterator/SampleUpdateAddColumnsDataIterator.java
Show resolved
Hide resolved
| if (o instanceof String s) | ||
| { | ||
| aliquotParentName = StringUtilsLabKey.unquoteString((String) o); | ||
| aliquotParentName = StringUtilsLabKey.unquoteString(s); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is existing code. Just pointing out that unquoteString() looks very out of place here???
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be able to use ExperimentService.getParentValues here instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I plan on leaving as-is for this PR.
|
|
||
| List<UploadSampleRunRecord> runRecords = new ArrayList<>(); | ||
| Set<String> keys = new LinkedHashSet<>(); | ||
| Set<Object> keys = new LinkedHashSet<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For type validation, should we use LongHashMap/CaseInsensitiveHashMap as appropriate and then cast to Set of Object? It's nice to move toward (runtime) type validating collections where possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, validation is good. However, I'm going to leave as-is for this PR.
| var map = DataIteratorUtil.createColumnNameMap(in); | ||
| if (map.containsKey(RootMaterialRowId.toString()) || !map.containsKey(RowId.toString())) | ||
| return in; | ||
| var ret = new SimpleTranslator(in, ctx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing wrong with this, but note to self: we should have a simple pattern for this that starts with WrapperDataIterator.
84442f1 to
957aeb4
Compare
experiment/src/org/labkey/experiment/api/SampleTypeUpdateServiceDI.java
Outdated
Show resolved
Hide resolved
experiment/src/org/labkey/experiment/api/SampleTypeUpdateServiceDI.java
Outdated
Show resolved
Hide resolved
experiment/src/org/labkey/experiment/api/SampleTypeUpdateServiceDI.java
Outdated
Show resolved
Hide resolved
f4d9bc6 to
7398e73
Compare
…ng all specified - always include table key columns
| ); | ||
| }, DbScope.CommitTaskOption.POSTCOMMIT); | ||
| var nextIndex = index + 1; | ||
| while (nextIndex < rows.size() && rowKeys.equals(new CaseInsensitiveHashSet(rows.get(nextIndex).keySet()))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side-note: If the rows happen to be ArrayListMap then the keySet() objects should be the same (==) and this will be fast. I don't know in which code paths that would be the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, only the test version of ArrayListMap overrides keySet(), the server implementation should too!
Rationale
For the last few years sample types have had two pathways through the code for updating samples; row-by-row and data iterator. These two pathways are expected to overlap completely in terms of functional input/output, however, we often see issues where something is supported (or broken) in one and not the other. Having these two pathways is difficult to maintain, is easy to introduce bugs if both pathways are not considered, and overall gives us less confidence in the sample update process.
With these changes the row-by-row update pathway has been removed entirely. The data iterator pathway is now the only way samples are updated. This should help us with stability and reduce complexity. One challenge we have faced until now is how to reconcile these pathways because data iterator requires that all rows have the same columns specified where as row-by-row doesn't. To achieve functional parity in data iterator we now partition the rows by the columns specified before iterating. Each partition is then looped over, in order, until all rows are processed.
Additionally, this removes the duplicate LSID column from provisioned sample type tables. Additionally, it updates the data iterator pathway for samples to support being keyed by (RowId) or (Name, MaterialSourceId). Formerly, specifying (LSID) was the only way to invoke the data iterator pathway. With these changes specifying LSID as a key for sample update is no longer supported.
Sample Update Functional Changes:
Related Pull Requests
Changes
ExistingRecordDataIteratorvia new experiment table interfacesgetExistingRecordKeyColumnNames()andgetExistingRecordSharedKeyColumnNames().