Go SDK for Arize Phoenix - an open-source observability platform for LLM applications.
go get github.com/agentplexus/go-phoenix

package main

import (
	"context"
	"log"

	phoenixotel "github.com/agentplexus/go-phoenix/otel"
	"go.opentelemetry.io/otel/trace"
)

func main() {
	ctx := context.Background()

	// Register with Phoenix (sends traces to localhost:6006)
	tp, err := phoenixotel.Register(
		phoenixotel.WithProjectName("my-app"),
		phoenixotel.WithBatch(true),
	)
	if err != nil {
		log.Fatal(err)
	}
	defer tp.Shutdown(ctx)

	// Create traces
	tracer := tp.Tracer("my-service")
	ctx, span := tracer.Start(ctx, "llm-call",
		trace.WithAttributes(
			phoenixotel.WithSpanKind(phoenixotel.SpanKindLLM),
			phoenixotel.WithModelName("gpt-4"),
			phoenixotel.WithInput("What is the capital of France?"),
		),
	)

	// ... do work ...

	span.SetAttributes(phoenixotel.WithOutput("The capital of France is Paris."))
	span.End()
}

package main
import (
	"context"
	"log"

	phoenix "github.com/agentplexus/go-phoenix"
)

func main() {
	ctx := context.Background()

	client, err := phoenix.NewClient(
		phoenix.WithBaseURL("http://localhost:6006"),
	)
	if err != nil {
		log.Fatal(err)
	}

	// List projects
	projects, _, err := client.ListProjects(ctx)
	if err != nil {
		log.Fatal(err)
	}
	for _, p := range projects {
		log.Printf("Project: %s", p.Name)
	}
}

For Phoenix Cloud, pass the space ID and API key when registering:

tp, err := phoenixotel.Register(
	phoenixotel.WithSpaceID("your-space-id"), // From app.phoenix.arize.com/s/{space-id}
	phoenixotel.WithAPIKey("your-api-key"),
	phoenixotel.WithProjectName("my-project"),
)

Or via environment variables:

export PHOENIX_SPACE_ID=your-space-id
export PHOENIX_API_KEY=your-api-key

Metrics are defined in omniobserve/llmops/metrics and can be used with the Phoenix evaluator:
package main

import (
	"context"
	"fmt"
	"os"

	phoenix "github.com/agentplexus/go-phoenix"
	"github.com/agentplexus/go-phoenix/evals"
	"github.com/agentplexus/omniobserve/llmops"
	"github.com/agentplexus/omniobserve/llmops/metrics"
	"github.com/agentplexus/omnillm"
)

func main() {
	ctx := context.Background()

	// Setup omnillm for LLM-based metrics
	// (errors are ignored here for brevity; check them in real code)
	llmClient, _ := omnillm.NewClient(omnillm.ClientConfig{
		Provider: omnillm.ProviderNameOpenAI,
		APIKey:   os.Getenv("OPENAI_API_KEY"),
	})
	defer llmClient.Close()
	llm := metrics.NewLLM(llmClient, "gpt-4o")

	// Setup Phoenix client
	phoenixClient, _ := phoenix.NewClient()

	// Create evaluator
	evaluator := evals.NewEvaluator(phoenixClient)

	// Run evaluation with metrics
	result, _ := evaluator.Evaluate(ctx, llmops.EvalInput{
		Input:   "What is the capital of France?",
		Output:  "The capital of France is London.",
		Context: []string{"Paris is the capital of France."},
		SpanID:  "your-span-id", // Optional: records results to Phoenix
	},
		metrics.NewHallucinationMetric(llm), // LLM-based
		metrics.NewExactMatchMetric(),       // Code-based
	)

	// result.Scores contains evaluation results
	for _, score := range result.Scores {
		fmt.Printf("%s: %.2f\n", score.Name, score.Score)
	}
}

Available Metrics:
| Metric | Type | Description |
|---|---|---|
| HallucinationMetric | LLM | Detects unsupported claims |
| RelevanceMetric | LLM | Evaluates document relevance |
| QACorrectnessMetric | LLM | Checks answer correctness |
| ToxicityMetric | LLM | Detects harmful content |
| ExactMatchMetric | Code | Exact string comparison |
| RegexMetric | Code | Regex pattern matching |
| ContainsMetric | Code | Substring presence check |
| Variable | Description |
|---|---|
| PHOENIX_COLLECTOR_ENDPOINT | Phoenix collector endpoint (defaults to Phoenix Cloud when PHOENIX_SPACE_ID is set) |
| PHOENIX_SPACE_ID | Space identifier for Phoenix Cloud (e.g., "johncwang" from app.phoenix.arize.com/s/johncwang) |
| PHOENIX_PROJECT_NAME | Project name for traces |
| PHOENIX_API_KEY | API key for authentication |
| PHOENIX_CLIENT_HEADERS | Additional headers (W3C Baggage format) |
| OTEL_EXPORTER_OTLP_ENDPOINT | Fallback OTLP endpoint |
The sections below are ordered by typical usage: start with phoenix-otel to send traces, use phoenix-client to manage resources, then add phoenix-evals to evaluate outputs. Each can be used independently, so adopt whichever subset fits your needs.
| Feature | Python SDK | go-phoenix | Status |
|---|---|---|---|
| phoenix-otel | |||
| Register tracer provider | ✅ | ✅ | Parity |
| OTLP HTTP exporter | ✅ | ✅ | Parity |
| OpenInference attributes | ✅ | ✅ | Parity |
| Environment variables | ✅ | ✅ | Parity |
| Batch processing | ✅ | ✅ | Parity |
| phoenix-client (REST API) | |||
| Projects | ✅ | ✅ | Parity |
| Spans | ✅ | ✅ | Parity |
| Datasets | ✅ | ✅ | Parity |
| Experiments | ✅ | ✅ | Parity |
| Prompts | ✅ | ✅ | Parity |
| Annotations | ✅ | ✅ | Parity |
| Sessions | ✅ | ✅ | Parity |
| phoenix-evals | |||
| LLM evaluators | ✅ | ✅ | Parity |
| Hallucination detection | ✅ | ✅ | Parity |
| Relevance scoring | ✅ | ✅ | Parity |
| Q&A correctness | ✅ | ✅ | Parity |
| Toxicity detection | ✅ | ✅ | Parity |
| Exact match | ✅ | ✅ | Parity |
| Regex matching | ✅ | ✅ | Parity |
| Custom templates | ✅ | ✅ | Parity |
| Auto-instrumentation | |||
| OpenAI auto-instrument | ✅ | ❌ | N/A (Go limitation) |
| Anthropic auto-instrument | ✅ | ❌ | N/A (Go limitation) |
The otel package provides constants for OpenInference span kinds:

- SpanKindLLM - LLM inference calls
- SpanKindChain - Chain/workflow operations
- SpanKindTool - Tool/function calls
- SpanKindAgent - Agent operations
- SpanKindRetriever - Document retrieval
- SpanKindEmbedding - Embedding generation
- SpanKindReranker - Reranking operations
- SpanKindGuardrail - Guardrail checks
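
For orientation, the sketch below shows how span kinds appear on the wire under the OpenInference semantic conventions: a single span attribute keyed `openinference.span.kind` with an upper-case value. That the go-phoenix constants map to exactly these strings is an assumption, and `spanKindAttribute` is a hypothetical helper, not a function of the otel package.

```go
package main

import "fmt"

// spanKindKey is the OpenInference attribute key for the span kind.
const spanKindKey = "openinference.span.kind"

// validSpanKinds lists the OpenInference values for the kinds named above.
var validSpanKinds = map[string]bool{
	"LLM":       true,
	"CHAIN":     true,
	"TOOL":      true,
	"AGENT":     true,
	"RETRIEVER": true,
	"EMBEDDING": true,
	"RERANKER":  true,
	"GUARDRAIL": true,
}

// spanKindAttribute returns the key/value pair for a span kind, or an
// error when the kind is not one of the known values.
func spanKindAttribute(kind string) (string, string, error) {
	if !validSpanKinds[kind] {
		return "", "", fmt.Errorf("unknown span kind %q", kind)
	}
	return spanKindKey, kind, nil
}

func main() {
	key, value, err := spanKindAttribute("LLM")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s=%s\n", key, value)
}
```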
go-phoenix can be used as a provider for omniobserve:
import (
	"github.com/agentplexus/omniobserve/llmops"
	_ "github.com/agentplexus/go-phoenix/llmops" // Register provider
)

func main() {
	provider, err := llmops.Open("phoenix",
		llmops.WithEndpoint("http://localhost:6006"),
	)
	// ...
}

| Feature | Phoenix (Python) | go-phoenix | omniobserve/llmops | Tests | Notes |
|---|---|---|---|---|---|
| Tracing | |||||
| StartTrace | ✅ | ✅ | ✅ | ✅ | Via phoenix-otel |
| StartSpan | ✅ | ✅ | ✅ | ✅ | Via phoenix-otel |
| SetInput/Output | ✅ | ✅ | ✅ | ✅ | OpenInference attributes |
| SetModel/Provider | ✅ | ✅ | ✅ | ✅ | OpenInference attributes |
| SetUsage (tokens) | ✅ | ✅ | ✅ | ✅ | OpenInference attributes |
| AddFeedbackScore | ✅ | ✅ | ✅ | ✅ | Via OTEL events |
| TraceFromContext | ✅ | ✅ | ✅ | ✅ | |
| SpanFromContext | ✅ | ✅ | ✅ | ✅ | |
| Nested Spans | ✅ | ✅ | ✅ | ✅ | |
| Span Types | ✅ | ✅ | ✅ | ✅ | general, llm, tool, retrieval, agent, chain |
| Duration/Timing | ✅ | ✅ | ✅ | ✅ | |
| Prompts | |||||
| CreatePrompt | ✅ | ✅ | ✅ | ✅ | Use WithPromptModel/WithPromptProvider |
| GetPromptLatest | ✅ | ✅ | ✅ | ✅ | |
| GetPromptVersion | ✅ | ✅ | ✅ | | Via GetPrompt(name, versionID) |
| GetPromptByTag | ✅ | ✅ | ✅ | ✅ | Via GetPrompt(name, tagName) |
| ListPromptVersions | ✅ | ✅ | ❌ | ||
| ListPrompts | ✅ | ✅ | ✅ | ✅ | |
| Datasets | |||||
| CreateDataset | ✅ | ✅ | ✅ | ✅ | |
| GetDataset | ✅ | ✅ | ✅ | | By name |
| GetDatasetById | ✅ | ✅ | ✅ | ✅ | |
| AddDatasetItems | ✅ | ✅ | ✅ | ✅ | |
| ListDatasets | ✅ | ✅ | ✅ | ✅ | |
| DeleteDataset | ✅ | ✅ | ✅ | ✅ | |
| Experiments | |||||
| CreateExperiment | ✅ | ✅ | ❌ | | Not in omniobserve interface |
| RunExperiment | ✅ | ✅ | ❌ | | Not in omniobserve interface |
| ListExperiments | ✅ | ✅ | ❌ | | Not in omniobserve interface |
| Projects | |||||
| CreateProject | ✅ | ✅ | ✅ | ✅ | |
| GetProject | ✅ | ✅ | ✅ | ||
| ListProjects | ✅ | ✅ | ✅ | ✅ | |
| SetProject | ✅ | ✅ | ✅ | ✅ | |
| Evaluation | |||||
| Evaluate | ✅ | ✅ | ✅ | ✅ | Run metrics |
| AddFeedbackScore | ✅ | ✅ | ✅ | ✅ | Record results |
| Annotations | |||||
| CreateAnnotation | ✅ | ✅ | ✅ | ✅ | |
| ListAnnotations | ✅ | ✅ | ✅ | ✅ |
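
The blank import in the omniobserve provider example registers the Phoenix provider as a side effect, after which `llmops.Open("phoenix", ...)` can look it up by name. The sketch below illustrates that general registration pattern (the same one `database/sql` uses for drivers) with standard-library code only. All types and functions here are hypothetical; the actual omniobserve registry may be structured differently.

```go
package main

import (
	"fmt"
	"sync"
)

// Provider is a hypothetical minimal provider interface for this sketch.
type Provider interface {
	Name() string
}

// Factory builds a Provider for a given endpoint.
type Factory func(endpoint string) (Provider, error)

var (
	mu        sync.RWMutex
	factories = map[string]Factory{}
)

// Register makes a provider factory available under a name. Provider
// packages typically call this from an init function.
func Register(name string, f Factory) {
	mu.Lock()
	defer mu.Unlock()
	factories[name] = f
}

// Open looks up a registered factory by name and builds the provider.
func Open(name, endpoint string) (Provider, error) {
	mu.RLock()
	f, ok := factories[name]
	mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("unknown provider %q (missing blank import?)", name)
	}
	return f(endpoint)
}

// phoenixProvider is a stand-in for a real provider implementation.
type phoenixProvider struct{ endpoint string }

func (p *phoenixProvider) Name() string { return "phoenix@" + p.endpoint }

// In a real provider package this init would run because of the blank import.
func init() {
	Register("phoenix", func(endpoint string) (Provider, error) {
		return &phoenixProvider{endpoint: endpoint}, nil
	})
}

func main() {
	p, err := Open("phoenix", "http://localhost:6006")
	if err != nil {
		fmt.Println("open failed:", err)
		return
	}
	fmt.Println(p.Name())
}
```

With this pattern, forgetting the blank import shows up as an "unknown provider" error at `Open` time rather than as a compile error.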
Running omniobserve/llmops tests:
# Without an API key set, the tests are skipped
go test -v ./llmops/
# Run tests with Phoenix Cloud
export PHOENIX_API_KEY=your-api-key
export PHOENIX_SPACE_ID=your-space-id
go test -v ./llmops/

MIT License - see LICENSE for details.