Ranking Formulas

Operator Reference

Complete reference for the rankBy expression language and custom ranking in search.

Both repository and user search support custom ranking via the rankBy parameter. This allows you to control how results are scored and ordered.

How Ranking Works

Repository search uses a multi-query strategy:

Vector query - Semantic similarity via ANN search on embeddings
Popularity query - Repositories with high star counts
Activity query - Repositories with many closed issues
Recency query - Recently updated repositories

Results from all queries are deduplicated, then scored using your rankBy formula.
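The dedupe-then-score step can be pictured with a minimal sketch. The `Candidate` shape, the `mergeAndRank` function, and the sample data are hypothetical and only illustrate the idea; the API's internal implementation may differ.

```typescript
// Hypothetical candidate shape; field names mirror the rankBy attributes.
interface Candidate {
  id: string;
  ann: number;
  stars: number;
  issues_closed: number;
}

// Merge result lists from several queries, keep one entry per id,
// then order by a caller-supplied score (highest first).
function mergeAndRank(
  lists: Candidate[][],
  score: (c: Candidate) => number,
): Candidate[] {
  // Prototype-free object so ids never collide with built-in property names.
  const seen: { [id: string]: boolean } = Object.create(null);
  const merged: Candidate[] = [];
  for (const list of lists) {
    for (const c of list) {
      if (!seen[c.id]) {
        seen[c.id] = true;
        merged.push(c);
      }
    }
  }
  return merged.sort((a, b) => score(b) - score(a));
}

const vectorHits: Candidate[] = [
  { id: "a/hooks", ann: 0.92, stars: 120, issues_closed: 15 },
  { id: "b/react", ann: 0.8, stars: 200000, issues_closed: 12000 },
];
const popularHits: Candidate[] = [
  { id: "b/react", ann: 0.8, stars: 200000, issues_closed: 12000 },
  { id: "c/ui-kit", ann: 0.4, stars: 90000, issues_closed: 3000 },
];

// Rank by semantic similarity only, for illustration.
const ranked = mergeAndRank([vectorHits, popularHits], (c) => c.ann);
```

Here `b/react` appears in two query results but is scored once, so `ranked` holds three entries.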

Performance optimization: The API automatically detects when your rankBy references only ann and skips the multi-query strategy, which reduces both latency and cost.
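To check ahead of time whether a formula would qualify, you could walk the expression tree yourself. The sketch below is an assumption about what "only uses ann" means (no other attribute is referenced anywhere in the tree); the `Expr` type, `attrsUsed`, and `isAnnOnly` are local, hypothetical names, not part of the API.

```typescript
// Local, simplified expression type covering the documented node shapes.
type Expr =
  | { type: "Attr"; name: string }
  | { type: "Const"; value: number }
  | { type: "Sum" | "Mult" | "Div" | "Max" | "Min"; exprs: Expr[] }
  | { type: "Log"; base: number; expr: Expr }
  | { type: "Saturate"; expr: Expr; midpoint: number; exponent?: number };

// Collect every attribute name referenced anywhere in the tree.
function attrsUsed(e: Expr, out: string[] = []): string[] {
  switch (e.type) {
    case "Attr":
      if (out.indexOf(e.name) === -1) out.push(e.name);
      break;
    case "Const":
      break;
    case "Log":
    case "Saturate":
      attrsUsed(e.expr, out);
      break;
    default:
      for (const child of e.exprs) attrsUsed(child, out);
  }
  return out;
}

// Hypothetical check: true when no attribute other than `ann` is referenced.
function isAnnOnly(e: Expr): boolean {
  return attrsUsed(e).every((name) => name === "ann");
}

const pure: Expr = { type: "Attr", name: "ann" };
const mixed: Expr = {
  type: "Sum",
  exprs: [
    { type: "Attr", name: "ann" },
    { type: "Attr", name: "stars" },
  ],
};
```

With these inputs, `isAnnOnly(pure)` is true and `isAnnOnly(mixed)` is false.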

Default Formulas

Repository Search (with query)

When a search query is provided, repos are ranked with a log-normalized 70/20/10 formula:

0.7 × ann + 0.2 × logNorm(stars, 500k) + 0.1 × logNorm(issues_closed, 200k)

This balances:

70% semantic similarity — finds conceptually relevant repos
20% popularity — boosts well-established projects
10% activity — boosts actively maintained repos
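As a worked example, the default formula can be written as plain arithmetic. The `logNorm` below is a local sketch that follows the expanded form shown under Balanced (Default Formula), that is, log10(max(value, 1)) divided by log10 of the cap; the function names and sample repos are illustrative.

```typescript
// log10 via natural logs, so the sketch does not depend on Math.log10.
const log10 = (x: number): number => Math.log(x) / Math.log(10);

// Assumed shape of logNorm: log10(max(value, 1)) / log10(cap).
function logNorm(value: number, cap: number): number {
  return log10(Math.max(value, 1)) / log10(cap);
}

// Default repository score when a search query is present.
function defaultScore(ann: number, stars: number, issuesClosed: number): number {
  return (
    0.7 * ann +
    0.2 * logNorm(stars, 500000) +
    0.1 * logNorm(issuesClosed, 200000)
  );
}

// A close semantic match with few stars vs. a looser match with many stars.
const niche = defaultScore(0.9, 100, 10);
const popular = defaultScore(0.6, 100000, 5000);
```

Under these assumptions `niche` comes out at roughly 0.72 and `popular` at roughly 0.67, so the closer semantic match still ranks first despite having far fewer stars.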

Repository Search (without query)

When no search query is provided (filter-only searches), there is no semantic similarity score. The weights are re-normalized to sum to 1.0, and saturate is used instead of log normalization:

0.667 × saturate(stars, 500k) + 0.333 × saturate(issues_closed, 200k)

The ratio between popularity and activity is preserved from the default weights (0.2 and 0.1), just scaled up to fill the full range.
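The re-normalization is simple arithmetic: divide each remaining weight by their sum. A short sketch, with `renormalize` as an illustrative helper name:

```typescript
// Rescale weights so they sum to 1 while preserving their ratios.
function renormalize(weights: number[]): number[] {
  const total = weights.reduce((acc, w) => acc + w, 0);
  return weights.map((w) => w / total);
}

// Drop the 0.7 semantic weight; keep popularity (0.2) and activity (0.1).
const [popularity, activity] = renormalize([0.2, 0.1]);
// popularity is about 0.667 and activity is about 0.333
```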

User Search

User search defaults to ordering by updatedAt descending (most recently updated profiles first):

await client.searchUsers.search({
  query: "react developer",
  rankBy: ["updatedAt", "desc"],
});

We recommend changing this to suit your use case — for example, sorting by createdAt for account age, or combining with filters to get the ordering you need.

Available Attributes

Attribute	Type	Description	Range
`ann`	number	Vector similarity score	0–1 (1 = most similar)
`stars`	number	Repository star count	0–∞
`issues_closed`	number	Total closed issues	0–∞
`age`	number	Days since repository creation	0–∞
`recency`	number	Days since last update	0–∞

Expression Operations

Sum

Add multiple scores together.

{
  "type": "Sum",
  "exprs": [
    { "type": "Attr", "name": "ann" },
    { "type": "Attr", "name": "stars" }
  ]
}

Mult

Multiply two expressions.

{
  "type": "Mult",
  "exprs": [
    { "type": "Const", "value": 0.7 },
    { "type": "Attr", "name": "ann" }
  ]
}

Div

Divide one expression by another.

{
  "type": "Div",
  "exprs": [
    { "type": "Attr", "name": "stars" },
    { "type": "Const", "value": 1000 }
  ]
}

Max

Return the maximum of multiple expressions.

{
  "type": "Max",
  "exprs": [
    { "type": "Attr", "name": "stars" },
    { "type": "Const", "value": 1 }
  ]
}

Min

Return the minimum of multiple expressions.

{
  "type": "Min",
  "exprs": [
    { "type": "Attr", "name": "ann" },
    { "type": "Const", "value": 0.9 }
  ]
}

Log

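Logarithm of an expression in a custom base. Because the logarithm of 0 is undefined, wrap attributes that can be 0 in a Max with a constant of 1, as the default formula does.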

{
  "type": "Log",
  "base": 10,
  "expr": { "type": "Attr", "name": "stars" }
}

Attr

Reference an attribute value.

{ "type": "Attr", "name": "ann" }

Const

A constant numeric value.

{ "type": "Const", "value": 0.5 }
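To make the semantics of the operators above concrete, here is a small local evaluator for them. It is a sketch under assumptions (for instance, that Div divides the first expression by the second) and is not the API's implementation; `Expr`, `Attrs`, and `evaluate` are hypothetical names.

```typescript
type Expr =
  | { type: "Attr"; name: string }
  | { type: "Const"; value: number }
  | { type: "Sum" | "Mult" | "Div" | "Max" | "Min"; exprs: Expr[] }
  | { type: "Log"; base: number; expr: Expr };

type Attrs = { [name: string]: number };

// Recursively compute the numeric value of an expression for one result.
function evaluate(e: Expr, attrs: Attrs): number {
  switch (e.type) {
    case "Attr":
      return attrs[e.name];
    case "Const":
      return e.value;
    case "Log":
      // Change of base: log_b(x) = ln(x) / ln(b).
      return Math.log(evaluate(e.expr, attrs)) / Math.log(e.base);
    default: {
      const vals = e.exprs.map((child) => evaluate(child, attrs));
      switch (e.type) {
        case "Sum":
          return vals.reduce((a, b) => a + b, 0);
        case "Mult":
          return vals.reduce((a, b) => a * b, 1);
        case "Div":
          return vals[0] / vals[1]; // assumed: first divided by second
        case "Max":
          return Math.max(...vals);
        default:
          return Math.min(...vals);
      }
    }
  }
}

// Sum(ann, Mult(0.0002, stars)) for a repo with ann = 0.8 and 1,000 stars.
const score = evaluate(
  {
    type: "Sum",
    exprs: [
      { type: "Attr", name: "ann" },
      {
        type: "Mult",
        exprs: [
          { type: "Const", value: 0.0002 },
          { type: "Attr", name: "stars" },
        ],
      },
    ],
  },
  { ann: 0.8, stars: 1000 },
);
```

The example evaluates to approximately 1.0 (0.8 from `ann` plus 0.2 from stars).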

Saturate

Soft-cap a value using a saturation curve. Approaches 1 as the input grows, with midpoint controlling where the output reaches 0.5. An optional exponent (defaults to 1) controls how sharply the curve saturates.

{
  "type": "Saturate",
  "expr": { "type": "Attr", "name": "stars" },
  "midpoint": 1000
}

With a custom exponent for a steeper curve:

{
  "type": "Saturate",
  "expr": { "type": "Attr", "name": "stars" },
  "midpoint": 1000,
  "exponent": 2
}

This is useful as an alternative to log normalization for mapping unbounded values like star counts into a 0–1 range. A repo whose star count equals midpoint scores 0.5, with diminishing returns beyond that.
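The exact curve is not spelled out here. One curve with the stated properties (0.5 at the midpoint, approaching 1 for large inputs, sharper with a larger exponent) is x^e / (x^e + midpoint^e). The sketch below uses that formula as an assumption, not as the API's definition.

```typescript
// Assumed saturation curve: x^e / (x^e + midpoint^e).
function saturate(x: number, midpoint: number, exponent: number = 1): number {
  const xe = Math.pow(x, exponent);
  return xe / (xe + Math.pow(midpoint, exponent));
}

const atMidpoint = saturate(1000, 1000); // 0.5
const tenTimes = saturate(10000, 1000); // 10000 / 11000, about 0.909
const tenTimesSteep = saturate(10000, 1000, 2); // 1e8 / (1e8 + 1e6), about 0.990
```

With an exponent of 2, the same input lands closer to 1, which is what a steeper curve means in practice.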

Formula Examples

Pure Semantic (Fastest)

Uses only vector search, so the API skips multi-query automatically.

await client.searchRepos.search({
  query: "react hooks",
  rankBy: { type: "Attr", name: "ann" },
});

When to use: Prototyping, exploratory search, or any case where quality signals don’t matter.

Popularity-Focused

Emphasizes popular projects.

{
  "type": "Sum",
  "exprs": [
    { "type": "Attr", "name": "ann" },
    {
      "type": "Mult",
      "exprs": [
        { "type": "Const", "value": 0.0002 },
        { "type": "Attr", "name": "stars" }
      ]
    }
  ]
}

When to use: Finding “best” or “most popular” libraries in a category.
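To see how strongly this formula favors stars, it helps to run the numbers. The sketch below restates Sum(ann, Mult(0.0002, stars)) as plain arithmetic; `popularityScore` and the sample values are illustrative.

```typescript
// Sum(ann, Mult(0.0002, stars)) written as plain arithmetic.
function popularityScore(ann: number, stars: number): number {
  return ann + 0.0002 * stars;
}

const closeMatch = popularityScore(0.95, 500); // 0.95 + 0.1, about 1.05
const looseMatch = popularityScore(0.3, 20000); // 0.30 + 4.0, about 4.30
```

At 5,000 stars the popularity term already contributes 1.0, the maximum possible value of `ann`, so very popular repos can outrank much closer semantic matches.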

Activity-Focused

Emphasizes actively maintained projects.

{
  "type": "Sum",
  "exprs": [
    { "type": "Attr", "name": "ann" },
    {
      "type": "Mult",
      "exprs": [
        { "type": "Const", "value": 0.0001 },
        { "type": "Attr", "name": "issues_closed" }
      ]
    }
  ]
}

When to use: Surfacing actively maintained projects ahead of abandoned repos. Ranking only reorders results, so add a filter if abandoned repos must be excluded entirely.

Balanced (Default Formula)

The full default formula with log normalization:

{
  "type": "Sum",
  "exprs": [
    {
      "type": "Mult",
      "exprs": [
        { "type": "Const", "value": 0.7 },
        { "type": "Attr", "name": "ann" }
      ]
    },
    {
      "type": "Mult",
      "exprs": [
        { "type": "Const", "value": 0.2 },
        {
          "type": "Div",
          "exprs": [
            {
              "type": "Log",
              "base": 10,
              "expr": {
                "type": "Max",
                "exprs": [
                  { "type": "Attr", "name": "stars" },
                  { "type": "Const", "value": 1 }
                ]
              }
            },
            { "type": "Const", "value": 5.699 }
          ]
        }
      ]
    },
    {
      "type": "Mult",
      "exprs": [
        { "type": "Const", "value": 0.1 },
        {
          "type": "Div",
          "exprs": [
            {
              "type": "Log",
              "base": 10,
              "expr": {
                "type": "Max",
                "exprs": [
                  { "type": "Attr", "name": "issues_closed" },
                  { "type": "Const", "value": 1 }
                ]
              }
            },
            { "type": "Const", "value": 5.301 }
          ]
        }
      ]
    }
  ]
}
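The divisors 5.699 and 5.301 are log10(500000) and log10(200000), rounded to three decimals. A hypothetical local builder (not the SDK's `logNorm`) that produces the same Div/Log/Max subtree for any attribute and cap might look like this:

```typescript
type Expr =
  | { type: "Attr"; name: string }
  | { type: "Const"; value: number }
  | { type: "Div" | "Max"; exprs: Expr[] }
  | { type: "Log"; base: number; expr: Expr };

// Expand logNorm(name, cap) into Div(Log10(Max(attr, 1)), log10(cap)).
function logNormExpr(name: string, cap: number): Expr {
  const divisor = Math.log(cap) / Math.log(10);
  return {
    type: "Div",
    exprs: [
      {
        type: "Log",
        base: 10,
        expr: {
          type: "Max",
          exprs: [
            { type: "Attr", name },
            { type: "Const", value: 1 }, // guard against log(0)
          ],
        },
      },
      // Rounded to three decimals to match the constants shown above.
      { type: "Const", value: Math.round(divisor * 1000) / 1000 },
    ],
  };
}

const starsTerm = logNormExpr("stars", 500000); // divisor 5.699
const issuesTerm = logNormExpr("issues_closed", 200000); // divisor 5.301
```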

SDK Helper Functions

The TypeScript SDK provides helper functions for building expressions:

import { sum, mult, div, max, log, attr, c, saturate, logNorm } from "@bountylab/sdk";

// Balanced formula
const rankBy = sum(
  mult(c(0.7), attr("ann")),
  mult(c(0.2), logNorm("stars", 500000)),
  mult(c(0.1), logNorm("issues_closed", 200000)),
);

await client.searchRepos.search({
  query: "web framework",
  rankBy,
});

Available Helpers

Helper	Description	Example
`sum(...exprs)`	Add expressions	`sum(attr('ann'), attr('stars'))`
`mult(a, b)`	Multiply two expressions	`mult(c(0.5), attr('ann'))`
`div(a, b)`	Divide expressions	`div(attr('stars'), c(1000))`
`max(...exprs)`	Maximum value	`max(attr('stars'), c(1))`
`min(...exprs)`	Minimum value	`min(attr('ann'), c(0.9))`
`log(expr, base)`	Logarithm	`log(attr('stars'), 10)`
`attr(name)`	Attribute reference	`attr('ann')`
`c(value)`	Constant value	`c(0.7)`
`saturate(expr, midpoint, exponent?)`	Soft-cap to 0–1 range	`saturate(attr('stars'), 1000)`
`logNorm(attr, max)`	Log-normalized attribute	`logNorm('stars', 500000)`

Performance Characteristics

rankBy Type	Multi-Query	Cost	Speed
Pure ANN (`ann` only)	No	1x	Fastest
With quality signals	Yes	4x	Slower
Custom (quality signals)	Auto-detected	4-5x	Slower

Best Practices

Start with default - The 70/20/10 formula works well for most use cases
Use pure ANN for prototyping - Skip multi-query for faster iteration
Combine with filters - Use a filter such as stargazerCount >= 100 instead of relying solely on rankBy
Log-normalize large numbers - Raw star counts overwhelm other signals
Test with real queries - Different formulas work better for different query types

String Parser (Alternative)

For simple expressions, you can use a string format:

await client.searchRepos.search({
  query: "react",
  rankBy: "Sum(ann, Mult(stars, 0.0001))",
});

Supported syntax:

Sum(expr1, expr2, ...)
Mult(expr1, expr2)
Max(expr1, ...)
Attribute names: ann, stars, issues_closed, age, recency
Numbers: 0.5, 1000, etc.
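To show how the string form relates to the JSON form, here is a minimal recursive-descent sketch that turns the syntax listed above into an expression tree. It is a hypothetical local parser, not the API's own, and it does only basic error checking.

```typescript
type Expr =
  | { type: "Attr"; name: string }
  | { type: "Const"; value: number }
  | { type: "Sum" | "Mult" | "Max"; exprs: Expr[] };

function parseRankBy(src: string): Expr {
  let pos = 0;

  function skipSpaces(): void {
    while (pos < src.length && src.charAt(pos) === " ") pos++;
  }

  function parseExpr(): Expr {
    skipSpaces();
    const rest = src.slice(pos);
    // Numbers such as 0.5 or 1000 become Const nodes.
    const num = /^-?\d+(\.\d+)?/.exec(rest);
    if (num) {
      pos += num[0].length;
      return { type: "Const", value: parseFloat(num[0]) };
    }
    const ident = /^[A-Za-z_]\w*/.exec(rest);
    if (!ident) throw new Error("Unexpected input at position " + pos);
    pos += ident[0].length;
    const word = ident[0];
    skipSpaces();
    // A bare identifier is an attribute name; one followed by "(" is a call.
    if (src.charAt(pos) !== "(") return { type: "Attr", name: word };
    if (word !== "Sum" && word !== "Mult" && word !== "Max") {
      throw new Error("Unsupported function: " + word);
    }
    pos++; // consume "("
    const exprs: Expr[] = [];
    for (;;) {
      exprs.push(parseExpr());
      skipSpaces();
      const ch = src.charAt(pos++);
      if (ch === ")") break;
      if (ch !== ",") throw new Error("Expected ',' or ')' at position " + (pos - 1));
    }
    return { type: word, exprs };
  }

  const result = parseExpr();
  skipSpaces();
  if (pos !== src.length) throw new Error("Trailing input at position " + pos);
  return result;
}

const parsed = parseRankBy("Sum(ann, Mult(stars, 0.0001))");
```

Here `parsed` is a Sum node whose children are an Attr node for `ann` and a Mult node over `stars` and the constant 0.0001.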