Open
Labels
bug: Something isn't working
Description
Bug description
The demo code in the section "Extract and save Item embeddings" raises "ValueError: no output schema".
Steps/Code to reproduce bug
Follow the demo as written and run the code in the section "Extract and save Item embeddings".
Expected behavior
Environment details
- Merlin version:
  - merlin 0.0.1
  - merlin-core 0+untagged.1.g6d396aa
  - merlin-dataloader 0+untagged.1.g1441a12
  - merlin-hps 1.0.0
  - merlin-models 0+untagged.1.geb1e541
  - merlin-sok 2.0.0
  - merlin-systems 0+untagged.1.ga19d311
- Platform: Linux 12ce9556ef42 5.4.0-200-generic
- Python version: 3.10.12
- PyTorch version (GPU?): N/A
- Tensorflow version (GPU?): 2.12.0+nv23.6
Additional context
  File "/root/raid/common_models/recommend_system/merlin/aliccp/extract_item_feature.py", line 56, in main
    item_embeddings = workflow.fit_transform(Dataset(item_features)).to_ddf().compute()
  File "/usr/local/lib/python3.10/dist-packages/nvtabular/workflow/workflow.py", line 236, in fit_transform
    self.fit(dataset)
  File "/usr/local/lib/python3.10/dist-packages/nvtabular/workflow/workflow.py", line 213, in fit
    self.executor.fit(dataset, self.graph)
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 501, in fit
    ).sample_dtypes()
  File "/usr/local/lib/python3.10/dist-packages/merlin/io/dataset.py", line 1169, in sample_dtypes
    _real_meta = self.engine.sample_data(n=n)
  File "/usr/local/lib/python3.10/dist-packages/merlin/io/dataset_engine.py", line 64, in sample_data
    _head = _ddf.partitions[partition_index].head(n)
  File "/usr/local/lib/python3.10/dist-packages/dask/dataframe/core.py", line 1268, in head
    return self._head(n=n, npartitions=npartitions, compute=compute, safe=safe)
  File "/usr/local/lib/python3.10/dist-packages/dask/dataframe/core.py", line 1302, in _head
    result = result.compute()
  File "/usr/local/lib/python3.10/dist-packages/dask/base.py", line 314, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/dask/base.py", line 599, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/dask/threaded.py", line 89, in get
    results = get_async(
  File "/usr/local/lib/python3.10/dist-packages/dask/local.py", line 511, in get_async
    raise_exception(exc, tb)
  File "/usr/local/lib/python3.10/dist-packages/dask/local.py", line 319, in reraise
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/dask/local.py", line 224, in execute_task
    result = _execute_task(task, data)
  File "/usr/local/lib/python3.10/dist-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/usr/local/lib/python3.10/dist-packages/dask/optimization.py", line 990, in __call__
    return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
  File "/usr/local/lib/python3.10/dist-packages/dask/core.py", line 149, in get
    result = _execute_task(task, cache)
  File "/usr/local/lib/python3.10/dist-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/usr/local/lib/python3.10/dist-packages/dask/utils.py", line 72, in apply
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 103, in transform
    transformed_data = self._execute_node(node, transformable, capture_dtypes, strict)
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 117, in _execute_node
    upstream_outputs = self._run_upstream_transforms(
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 135, in _run_upstream_transforms
    node_output = self._execute_node(
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 117, in _execute_node
    upstream_outputs = self._run_upstream_transforms(
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 135, in _run_upstream_transforms
    node_output = self._execute_node(
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 125, in _execute_node
    transform_output = self._run_node_transform(node, transform_input, capture_dtypes, strict)
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 255, in _run_node_transform
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 242, in _run_node_transform
    transformed_data = node.op.transform(selection, input_data)
  File "/usr/local/lib/python3.10/dist-packages/merlin/systems/dag/ops/workflow.py", line 107, in transform
    output = self.workflow._transform_df(transformable)
  File "/usr/local/lib/python3.10/dist-packages/nvtabular/workflow/workflow.py", line 256, in _transform_df
    raise ValueError("no output schema")
ValueError: no output schema
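For anyone triaging this, the last two frames suggest that the workflow wrapped by the merlin-systems op had no output schema at the moment `_transform_df` was called. The snippet below is only a toy sketch of that kind of guard, not the NVTabular source: `ToyWorkflow` and `try_transform` are hypothetical names, and the assumption that the check is a plain truthiness test on the schema is mine, since the traceback shows only the `raise` line.

```python
class ToyWorkflow:
    """Hypothetical stand-in for a workflow; NOT the NVTabular class."""

    def __init__(self):
        # In this sketch the schema is populated only by fit().
        self.output_schema = None

    def fit(self, rows):
        # Record the column names seen during fitting as the "schema".
        self.output_schema = sorted({key for row in rows for key in row})
        return self

    def _transform_df(self, rows):
        # Assumed guard: refuse to transform when no schema is available.
        if not self.output_schema:
            raise ValueError("no output schema")
        return [{col: row.get(col) for col in self.output_schema} for row in rows]


def try_transform(workflow, rows):
    """Return the transformed rows, or the error message if the guard trips."""
    try:
        return workflow._transform_df(rows)
    except ValueError as exc:
        return str(exc)


rows = [{"item_id": 1, "item_category": 7}]
unfitted_result = try_transform(ToyWorkflow(), rows)      # guard trips
fitted_result = try_transform(ToyWorkflow().fit(rows), rows)  # guard passes
```

If the real check behaves like this, it would be worth confirming whether the inner workflow used in the demo was fitted (or loaded with its schema intact) before being wrapped, though the traceback alone does not establish that.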