Open
Description
As described in #4 (comment), this library is to act as a bridge from Python Awkward Arrays into the Julia world, so that
- Julia can be used to accelerate tight loops (the way that Numba and C++ through cppyy or RDataFrame are currently being used, but with more freedom to create, fill, and iterate over arrays without a `snapshot` phase)
- Julia libraries that act on Arrow or AwkwardArray can be used in Python (such as a possible route from UnROOT.jl into Pythonic analysis, which would be faster than Uproot and a plug-in replacement for it)
- Python libraries that act on Arrow or Awkward Arrays can be used in Julia
through PyJulia and PyCall.jl. This library should be usable on its own, exclusively in Julia, but the initial goal is to make Julia more accessible to Python users of Awkward Arrays.
The first phase of development (targeting JuliaHEP 2023) will require the following.
- Depth of Julia-side functionality: data model, `is_valid`, int-getindex, range-getindex, iteration, equality (data equivalence, not layout equivalence), `length`/`firstindex`/`lastindex`, LayoutBuilder-style appending.
- PrimitiveArray still needs multidimensional support to be one-to-one with NumpyArray. #6
- Array nodes must support `parameters`, which implies a strict dependence on JSON.jl. #8
- ak.from_iter equivalent to convert from various Julia types into AwkwardArray. #10
- String representation for the Awkward type. (No need for the `Type` objects we have in Python.)
- String representation for data (following src/awkward/_prettyprint.py). #23
- All of the Awkward layout types. #12
- Actually implement the ak.to_buffers/ak.from_buffers equivalents on the Julia side. #24 (No need for the Form objects we have in Python; just navigate the JSON, since it only happens once. This might need to be a macro to customize output types.)
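
To make the first bullet concrete, here is a minimal sketch of the data model in plain Python (not the Julia implementation, and not the Awkward Array API). The class names `PrimitiveArray` and `ListOffsetArray` are borrowed from this issue; the helper `to_list` and the `Node` base class are hypothetical names used only for illustration. It shows int-getindex, range-getindex, iteration, `is_valid`, and equality defined on the data rather than on the layout.

```python
from dataclasses import dataclass
from typing import List, Union


class Node:
    """Hypothetical base class: equality compares data, not layout."""

    def __eq__(self, other):
        return isinstance(other, Node) and self.to_list() == other.to_list()


@dataclass(eq=False)
class PrimitiveArray(Node):
    """Leaf node: a flat, one-dimensional buffer of numbers."""
    data: List[float]

    def __len__(self):
        return len(self.data)

    def __getitem__(self, where):
        if isinstance(where, slice):
            return PrimitiveArray(self.data[where])  # range-getindex
        return self.data[where]                      # int-getindex

    def is_valid(self):
        return True

    def to_list(self):
        return list(self.data)


@dataclass(eq=False)
class ListOffsetArray(Node):
    """Variable-length lists: list i spans content[offsets[i]:offsets[i + 1]]."""
    offsets: List[int]
    content: Union["PrimitiveArray", "ListOffsetArray"]

    def __len__(self):
        return len(self.offsets) - 1

    def is_valid(self):
        if len(self.offsets) < 1 or self.offsets[0] < 0:
            return False
        if self.offsets[-1] > len(self.content):
            return False
        monotonic = all(a <= b for a, b in zip(self.offsets, self.offsets[1:]))
        return monotonic and self.content.is_valid()

    def __getitem__(self, where):
        if isinstance(where, slice):
            start, stop, step = where.indices(len(self))
            if step != 1:
                raise NotImplementedError("only contiguous ranges in this sketch")
            stop = max(start, stop)
            # Range-getindex keeps the content and narrows the offsets,
            # so the result's offsets need not start at zero.
            return ListOffsetArray(self.offsets[start:stop + 1], self.content)
        if where < 0:
            where += len(self)
        if not 0 <= where < len(self):
            raise IndexError(where)
        return self.content[self.offsets[where]:self.offsets[where + 1]]

    def __iter__(self):
        for i in range(len(self)):
            yield self[i]

    def to_list(self):
        return [item.to_list() for item in self]


a = ListOffsetArray([0, 3, 3, 5], PrimitiveArray([1.1, 2.2, 3.3, 4.4, 5.5]))
sliced = a[1:3]                                                    # shares a's content
rebuilt = ListOffsetArray([0, 0, 2], PrimitiveArray([4.4, 5.5]))   # built from scratch
```

Here `sliced` and `rebuilt` have different offsets and different content buffers, yet they compare equal because both represent `[[], [4.4, 5.5]]`. That is the distinction between data equivalence and layout equivalence.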
Nice to have:
- A Python module for round-tripping data between Python and Julia #14
- ak.to_arrow/ak.from_arrow equivalents on the Julia side, for better interop with the Julia packages that produce and consume Arrow data. (We don't want to round trip through Python for that.)
- Conversions to and from common Julia formats, such as ArraysOfArrays.jl and VectorOfArrays.
- Add `from_table` that uses the Tables.jl interface. #39
- Performance testing, probably using the jagged0/1/2/3 suite (synthetic) and the RNTuple suite (realistic analysis).
- Composition testing: can I swap in arrays with units? on GPUs? delayed processing? I'm using the `firstindex`/`lastindex` protocol to be offsets-safe; am I making any assumptions that will break naive composition? (Or are the other libraries?)
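
As a rough illustration of what round-tripping through the ak.to_buffers/ak.from_buffers equivalents involves, here is a pure-Python sketch that decomposes nested lists into a JSON form, a length, and a container of named buffers, then rebuilds the data by navigating the JSON. The functions `to_buffers` and `from_buffers` below are stand-ins written for this example, not the Awkward Array functions; the `ndim` argument is an assumption of the sketch; Python lists stand in for raw memory buffers; and the form contains only the fields the sketch needs. The `node0-offsets` style of key is meant to mirror the default naming that ak.to_buffers uses.

```python
import json


def to_buffers(data, ndim):
    """Decompose nested lists of numbers into (form JSON, length, container).

    `ndim` is the total nesting depth (1 means a flat list of numbers).
    """
    container = {}
    counter = [0]

    def build(items, ndim):
        key = f"node{counter[0]}"
        counter[0] += 1
        if ndim == 1:
            container[f"{key}-data"] = [float(x) for x in items]
            return {"class": "NumpyArray", "primitive": "float64", "form_key": key}
        offsets, flat = [0], []
        for sublist in items:
            flat.extend(sublist)
            offsets.append(len(flat))
        container[f"{key}-offsets"] = offsets
        return {
            "class": "ListOffsetArray",
            "offsets": "i64",
            "content": build(flat, ndim - 1),
            "form_key": key,
        }

    form = build(data, ndim)
    return json.dumps(form), len(data), container


def from_buffers(form_json, length, container):
    """Rebuild nested lists by walking the JSON form once."""

    def rebuild(form, length):
        key = form["form_key"]
        if form["class"] == "NumpyArray":
            return list(container[f"{key}-data"][:length])
        if form["class"] == "ListOffsetArray":
            offsets = container[f"{key}-offsets"][:length + 1]
            flat = rebuild(form["content"], offsets[-1])
            return [flat[offsets[i]:offsets[i + 1]] for i in range(length)]
        raise NotImplementedError(form["class"])

    return rebuild(json.loads(form_json), length)


original = [[[1.0, 2.0], []], [[3.0]]]
form_json, length, container = to_buffers(original, ndim=3)
roundtrip = from_buffers(form_json, length, container)
```

Only the JSON string, an integer, and flat buffers would need to cross the Python/Julia boundary; the nested structure is recovered entirely from the form.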