Ranking Formulas

Complete reference for the rankBy expression language and custom ranking in repository search.

Repository search supports custom ranking formulas via the rankBy parameter. This allows you to control how semantic similarity, popularity, and activity signals are combined.

Repository search uses a multi-query strategy:

  1. Vector query - Semantic similarity via ANN search on embeddings
  2. Popularity query - Repositories with high star counts
  3. Activity query - Repositories with many closed issues
  4. Recency query - Recently updated repositories

Results from all queries are deduplicated, then scored using your rankBy formula.
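The merge step can be pictured with a small local sketch. It assumes each query returns candidates keyed by a repository id and that every candidate already carries the attributes the formula needs; the `Candidate` shape and `mergeAndRank` function are hypothetical and say nothing about how the service is actually implemented.

```typescript
// Hypothetical candidate shape; the real API response may differ.
interface Candidate {
  id: string;
  ann: number;
  stars: number;
  issuesClosed: number;
}

// Deduplicate candidates from several result sets by id (first occurrence wins),
// then sort by a caller-supplied score, highest first.
function mergeAndRank(
  resultSets: Candidate[][],
  score: (repo: Candidate) => number,
): Candidate[] {
  const unique = new Map<string, Candidate>();
  for (const results of resultSets) {
    for (const repo of results) {
      if (!unique.has(repo.id)) unique.set(repo.id, repo);
    }
  }
  return Array.from(unique.values()).sort((a, b) => score(b) - score(a));
}

// Usage: the same repository returned by two queries appears once in the output.
const vectorHits: Candidate[] = [{ id: "a", ann: 0.9, stars: 10, issuesClosed: 2 }];
const popularHits: Candidate[] = [
  { id: "a", ann: 0.9, stars: 10, issuesClosed: 2 },
  { id: "b", ann: 0.4, stars: 90000, issuesClosed: 5000 },
];
console.log(mergeAndRank([vectorHits, popularHits], (r) => r.ann).map((r) => r.id));
```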

Performance optimization: The API automatically detects when your rankBy references only ann and skips the multi-query step, which makes the search faster and cheaper.
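How the API performs this detection is not documented here. One plausible approach, sketched below under that assumption, is to walk the expression tree and collect the attribute names it references. `RankExpr`, `attributesUsed`, and `isPureAnn` are hypothetical names, not SDK exports.

```typescript
// Loose structural type covering the JSON operators used in this reference.
type RankExpr = {
  type: string;
  name?: string;      // present on Attr
  value?: number;     // present on Const
  base?: number;      // present on Log
  expr?: RankExpr;    // present on Log
  exprs?: RankExpr[]; // present on Sum, Mult, Div, Max, Min
};

// Recursively collect every attribute name referenced by the expression.
function attributesUsed(
  node: RankExpr,
  found: Set<string> = new Set<string>(),
): Set<string> {
  if (node.type === "Attr" && node.name !== undefined) found.add(node.name);
  if (node.expr !== undefined) attributesUsed(node.expr, found);
  for (const child of node.exprs ?? []) attributesUsed(child, found);
  return found;
}

// True when ann is the only attribute the formula depends on.
function isPureAnn(node: RankExpr): boolean {
  const used = attributesUsed(node);
  return used.size === 1 && used.has("ann");
}

console.log(isPureAnn({ type: "Attr", name: "ann" }));
```

Under this sketch, a formula such as Mult(2, ann) would still count as pure ANN, while any formula that touches stars, issues_closed, age, or recency would not.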

The default formula is log-normalized 70/20/10:

0.7 × ann + 0.2 × log₁₀(stars) / log₁₀(500k) + 0.1 × log₁₀(issues_closed) / log₁₀(200k)

This balances:

  • 70% semantic similarity - Finds conceptually relevant repos
  • 20% popularity - Boosts well-established projects
  • 10% activity - Boosts actively maintained repos
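To make the weighting concrete, the sketch below evaluates the default formula locally in plain TypeScript. The names `RepoSignals`, `logNormalize`, and `defaultScore` and the sample values are assumptions for illustration only. Clamping counts to at least 1 mirrors the Max(..., 1) guard in the JSON form of the default formula.

```typescript
// Illustrative only: these names are hypothetical, not part of the SDK.
interface RepoSignals {
  ann: number;          // vector similarity, 0-1
  stars: number;        // repository star count
  issuesClosed: number; // total closed issues
}

// log10(value) / log10(cap); value is clamped to >= 1 so the result is never negative.
function logNormalize(value: number, cap: number): number {
  return Math.log10(Math.max(value, 1)) / Math.log10(cap);
}

// 0.7 * ann + 0.2 * normalized stars + 0.1 * normalized closed issues.
function defaultScore(r: RepoSignals): number {
  return (
    0.7 * r.ann +
    0.2 * logNormalize(r.stars, 500000) +
    0.1 * logNormalize(r.issuesClosed, 200000)
  );
}

// A highly similar niche repo versus a less similar repo sitting at both caps.
const niche = defaultScore({ ann: 0.9, stars: 50, issuesClosed: 10 });
const popular = defaultScore({ ann: 0.6, stars: 500000, issuesClosed: 200000 });
console.log(niche.toFixed(3), popular.toFixed(3));
```

With these sample inputs the two scores land close together, which shows how much similarity a repository at both caps can make up through the popularity and activity terms (0.3 in total at most).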

| Attribute | Type | Description | Range |
| --- | --- | --- | --- |
| ann | number | Vector similarity score | 0-1 (1 = most similar) |
| stars | number | Repository star count | 0 - ∞ |
| issues_closed | number | Total closed issues | 0 - ∞ |
| age | number | Days since repository creation | 0 - ∞ |
| recency | number | Days since last update | 0 - ∞ |

Sum - Adds multiple expressions together.

{
  "type": "Sum",
  "exprs": [
    { "type": "Attr", "name": "ann" },
    { "type": "Attr", "name": "stars" }
  ]
}

Mult - Multiplies two expressions.

{
  "type": "Mult",
  "exprs": [
    { "type": "Const", "value": 0.7 },
    { "type": "Attr", "name": "ann" }
  ]
}

Div - Divides the first expression by the second.

{
  "type": "Div",
  "exprs": [
    { "type": "Attr", "name": "stars" },
    { "type": "Const", "value": 1000 }
  ]
}

Max - Returns the largest of multiple expressions.

{
  "type": "Max",
  "exprs": [
    { "type": "Attr", "name": "stars" },
    { "type": "Const", "value": 1 }
  ]
}

Min - Returns the smallest of multiple expressions.

{
  "type": "Min",
  "exprs": [
    { "type": "Attr", "name": "ann" },
    { "type": "Const", "value": 0.9 }
  ]
}

Log - Takes the logarithm of an expression in the given base.

{
  "type": "Log",
  "base": 10,
  "expr": { "type": "Attr", "name": "stars" }
}

Attr - References an attribute value.

{ "type": "Attr", "name": "ann" }

Const - A constant numeric value.

{ "type": "Const", "value": 0.5 }
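The operators above form a small expression tree. As a way to reason about a formula before sending it, the sketch below models that tree as a TypeScript union and evaluates it locally. The `Expr` type and `evaluate` function are hypothetical and only mirror the JSON shapes shown in this section. Server-side behavior for edge cases (division by zero, the log of zero, more than two arguments to Mult) is not documented here and may differ.

```typescript
// Hypothetical local model of the rankBy expression tree.
type Expr =
  | { type: "Sum"; exprs: Expr[] }
  | { type: "Mult"; exprs: Expr[] }
  | { type: "Div"; exprs: Expr[] }
  | { type: "Max"; exprs: Expr[] }
  | { type: "Min"; exprs: Expr[] }
  | { type: "Log"; base: number; expr: Expr }
  | { type: "Attr"; name: string }
  | { type: "Const"; value: number };

function evaluate(e: Expr, attrs: Record<string, number>): number {
  switch (e.type) {
    case "Const":
      return e.value;
    case "Attr": {
      const v = attrs[e.name];
      if (v === undefined) throw new Error(`unknown attribute: ${e.name}`);
      return v;
    }
    case "Log":
      // Change of base: log_b(x) = ln(x) / ln(b)
      return Math.log(evaluate(e.expr, attrs)) / Math.log(e.base);
    case "Sum":
      return e.exprs.reduce((acc, x) => acc + evaluate(x, attrs), 0);
    case "Mult":
      return e.exprs.reduce((acc, x) => acc * evaluate(x, attrs), 1);
    case "Div": {
      const [num, den] = e.exprs;
      if (num === undefined || den === undefined) {
        throw new Error("Div needs two expressions");
      }
      return evaluate(num, attrs) / evaluate(den, attrs);
    }
    case "Max":
      return Math.max(...e.exprs.map((x) => evaluate(x, attrs)));
    case "Min":
      return Math.min(...e.exprs.map((x) => evaluate(x, attrs)));
  }
}

// Usage: the Mult example above, evaluated for a repository with ann = 0.5.
const weighted: Expr = {
  type: "Mult",
  exprs: [
    { type: "Const", value: 0.7 },
    { type: "Attr", name: "ann" },
  ],
};
console.log(evaluate(weighted, { ann: 0.5 }));
```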

Pure ANN - Uses only vector search, so multi-query is skipped automatically.

await client.searchRepos.search({
  query: "react hooks",
  rankBy: { type: "Attr", name: "ann" },
});

When to use: Prototyping, exploratory search, when quality signals don’t matter.

Emphasizes popular projects.

{
  "type": "Sum",
  "exprs": [
    { "type": "Attr", "name": "ann" },
    {
      "type": "Mult",
      "exprs": [
        { "type": "Const", "value": 0.0002 },
        { "type": "Attr", "name": "stars" }
      ]
    }
  ]
}

When to use: Finding “best” or “most popular” libraries in a category.

Emphasizes actively maintained projects.

{
  "type": "Sum",
  "exprs": [
    { "type": "Attr", "name": "ann" },
    {
      "type": "Mult",
      "exprs": [
        { "type": "Const", "value": 0.0001 },
        { "type": "Attr", "name": "issues_closed" }
      ]
    }
  ]
}

When to use: Finding actively maintained projects, excluding abandoned repos.

The full default formula with log normalization:

{
  "type": "Sum",
  "exprs": [
    {
      "type": "Mult",
      "exprs": [
        { "type": "Const", "value": 0.7 },
        { "type": "Attr", "name": "ann" }
      ]
    },
    {
      "type": "Mult",
      "exprs": [
        { "type": "Const", "value": 0.2 },
        {
          "type": "Div",
          "exprs": [
            {
              "type": "Log",
              "base": 10,
              "expr": {
                "type": "Max",
                "exprs": [
                  { "type": "Attr", "name": "stars" },
                  { "type": "Const", "value": 1 }
                ]
              }
            },
            { "type": "Const", "value": 5.699 }
          ]
        }
      ]
    },
    {
      "type": "Mult",
      "exprs": [
        { "type": "Const", "value": 0.1 },
        {
          "type": "Div",
          "exprs": [
            {
              "type": "Log",
              "base": 10,
              "expr": {
                "type": "Max",
                "exprs": [
                  { "type": "Attr", "name": "issues_closed" },
                  { "type": "Const", "value": 1 }
                ]
              }
            },
            { "type": "Const", "value": 5.301 }
          ]
        }
      ]
    }
  ]
}

The TypeScript SDK provides helper functions for building expressions:

import { sum, mult, div, max, log, attr, c, logNorm } from "@bountylab/sdk";

// Balanced formula
const rankBy = sum(
  mult(c(0.7), attr("ann")),
  mult(c(0.2), logNorm("stars", 500000)),
  mult(c(0.1), logNorm("issues_closed", 200000)),
);

await client.searchRepos.search({
  query: "web framework",
  rankBy,
});

| Helper | Description | Example |
| --- | --- | --- |
| sum(...exprs) | Add expressions | sum(attr('ann'), attr('stars')) |
| mult(a, b) | Multiply two expressions | mult(c(0.5), attr('ann')) |
| div(a, b) | Divide expressions | div(attr('stars'), c(1000)) |
| max(...exprs) | Maximum value | max(attr('stars'), c(1)) |
| min(...exprs) | Minimum value | min(attr('ann'), c(0.9)) |
| log(expr, base) | Logarithm | log(attr('stars'), 10) |
| attr(name) | Attribute reference | attr('ann') |
| c(value) | Constant value | c(0.7) |
| logNorm(attr, max) | Log-normalized attribute | logNorm('stars', 500000) |
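
To show how the helpers relate to the JSON form, here is a local re-implementation of a few of them. This is not the SDK's source. It assumes that logNorm(attr, max) expands to the Div/Log/Max structure used in the default formula JSON, where the divisor appears as a pre-rounded constant (5.699 for 500k). The loose `Expr` type is also an assumption.

```typescript
// Loose expression type for building JSON; hypothetical, not the SDK's type.
type Expr = { type: string; [key: string]: unknown };

// Local stand-ins that build the same JSON shapes as the documented operators.
const attr = (name: string): Expr => ({ type: "Attr", name });
const c = (value: number): Expr => ({ type: "Const", value });
const max = (...exprs: Expr[]): Expr => ({ type: "Max", exprs });
const div = (a: Expr, b: Expr): Expr => ({ type: "Div", exprs: [a, b] });
const log = (expr: Expr, base: number): Expr => ({ type: "Log", base, expr });

// Assumed expansion: log10(max(attr, 1)) / log10(cap).
const logNorm = (name: string, cap: number): Expr =>
  div(log(max(attr(name), c(1)), 10), c(Math.log10(cap)));

// Prints a Div node whose divisor is approximately 5.699.
console.log(JSON.stringify(logNorm("stars", 500000), null, 2));
```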

| rankBy Type | Multi-Query | Cost | Speed |
| --- | --- | --- | --- |
| Pure ANN (ann only) | No | 1x | Fastest |
| With quality signals | Yes | 4x | Slower |
| Custom (quality signals) | Auto-detected | 4-5x | Slower |

  1. Start with default - The 70/20/10 formula works well for most use cases
  2. Use pure ANN for prototyping - Skip multi-query for faster iteration
  3. Combine with filters - Use a filter such as stargazerCount >= 100 instead of relying solely on rankBy
  4. Log-normalize large numbers - Raw star counts overwhelm other signals
  5. Test with real queries - Different formulas work better for different query types
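
The point about log normalization is easy to check with arithmetic. The sketch below compares the raw star term from the 0.0002 × stars example with the log-normalized star term from the default formula. The function names are hypothetical.

```typescript
// Raw term: weight 0.0002 applied directly to the star count.
function rawStarTerm(stars: number): number {
  return 0.0002 * stars;
}

// Log-normalized term: 0.2 * log10(max(stars, 1)) / log10(500000).
function logStarTerm(stars: number): number {
  return (0.2 * Math.log10(Math.max(stars, 1))) / Math.log10(500000);
}

for (const stars of [10, 1000, 100000]) {
  console.log(stars, rawStarTerm(stars).toFixed(3), logStarTerm(stars).toFixed(3));
}
```

For a repository with 100,000 stars the raw term is about 20, far above the 0-1 range of ann, while the log-normalized term stays below 0.2.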

For simple expressions, you can use a string format:

await client.searchRepos.search({
  query: "react",
  rankBy: "Sum(ann, Mult(stars, 0.0001))",
});

Supported syntax:

  • Sum(expr1, expr2, ...)
  • Mult(expr1, expr2)
  • Max(expr1, ...)
  • Attribute names: ann, stars, issues_closed, age, recency
  • Numbers: 0.5, 1000, etc.
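
To illustrate how a string such as the one above could map onto the JSON form, here is a minimal recursive-descent parser covering only the syntax listed. It is a sketch: the real parsing happens inside the API or SDK and may accept or reject different input. `parseRankBy` is a hypothetical name, and the assumption that each string function maps to the JSON operator of the same name is mine.

```typescript
type Expr =
  | { type: "Sum" | "Mult" | "Max"; exprs: Expr[] }
  | { type: "Attr"; name: string }
  | { type: "Const"; value: number };

const ATTRIBUTES = ["ann", "stars", "issues_closed", "age", "recency"];

function parseRankBy(source: string): Expr {
  // Tokens: identifiers, non-negative numbers, parentheses, commas.
  // Characters that match none of these patterns are ignored by this simple tokenizer.
  const tokens: string[] = source.match(/[A-Za-z_]+|\d+(?:\.\d+)?|[(),]/g) ?? [];
  let pos = 0;

  function parseExpr(): Expr {
    const tok = tokens[pos++];
    if (tok === undefined) throw new Error("unexpected end of input");
    if (/^\d/.test(tok)) return { type: "Const", value: Number(tok) };
    if (ATTRIBUTES.indexOf(tok) !== -1) return { type: "Attr", name: tok };
    if (tok === "Sum" || tok === "Mult" || tok === "Max") {
      if (tokens[pos++] !== "(") throw new Error(`expected "(" after ${tok}`);
      const exprs: Expr[] = [parseExpr()];
      while (tokens[pos] === ",") {
        pos++; // consume the comma
        exprs.push(parseExpr());
      }
      if (tokens[pos++] !== ")") throw new Error(`expected ")" to close ${tok}`);
      if (tok === "Mult" && exprs.length !== 2) {
        throw new Error("Mult takes exactly two expressions");
      }
      return { type: tok, exprs };
    }
    throw new Error(`unknown token: ${tok}`);
  }

  const result = parseExpr();
  if (pos !== tokens.length) throw new Error("unexpected trailing input");
  return result;
}

console.log(JSON.stringify(parseRankBy("Sum(ann, Mult(stars, 0.0001))"), null, 2));
```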