Searching for Users and Repositories

Learn how to use semantic search for repositories and full-text search for users with powerful filtering.

Bounty Lab provides two search capabilities: semantic search for repositories and BM25 full-text search for users.

Overview

Bounty Lab uses natural language queries combined with structured filters:

Repository Search: Semantic search - understands meaning and context
User Search: BM25 full-text search - fast keyword matching with relevance ranking
Filters: Structured filter objects for precise filtering (not query strings)

Searching Repositories

Repository search uses semantic search with vector embeddings to find repositories based on meaning and context, not just keywords.

Basic Semantic Search

const response = await client.searchRepos.search({
  query: "react component library with typescript",
});

// The response contains repositories semantically similar to your query:
// - count: number of results
// - repositories: array of results with relevance scores
console.log(response.repositories);

The semantic search understands intent and context:

// These queries find different but semantically related results:
await client.searchRepos.search({
  query: "machine learning for image classification",
});

await client.searchRepos.search({
  query: "computer vision neural networks",
});

Repository Search with Filters

Combine semantic search with structured filters to narrow results:

// Find TypeScript UI libraries with active communities
const response = await client.searchRepos.search({
  query: "component library design system",
  filters: {
    op: "And",
    filters: [
      { field: "language", op: "Eq", value: "TypeScript" },
      { field: "stargazerCount", op: "Gte", value: 1000 },
    ],
  },
  maxResults: 50,
});

Filter Structure

Filters use a {field, op, value} pattern. The key operators are grouped by field type below.

Important: Use FTS for Location and Free-Text Fields

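For fields like locations, company names, bios, and descriptions, always use ContainsAllTokens instead of Eq or In. These fields often vary in spelling, formatting, and case (e.g., “San Francisco, CA” vs “san francisco”), and token-based matching tolerates those differences. Abbreviations such as “SF” share no tokens with “San Francisco” and may not match, so check the resolved location fields (resolvedCity, resolvedState, resolvedCountry) when abbreviations matter.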
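
To make the idea of token matching concrete, here is a minimal local sketch of how an operator like ContainsAllTokens could behave. The tokenizer below (lowercase, split on anything that is not an ASCII letter or digit) is an assumption for illustration only; the service's actual tokenizer and normalization rules are not described in this guide.

```typescript
// Illustrative only: a simplified ASCII tokenizer, NOT the service's real one.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0);
}

// True when every token of `query` appears among the tokens of `fieldValue`.
function containsAllTokens(fieldValue: string, query: string): boolean {
  const fieldTokens = new Set(tokenize(fieldValue));
  return tokenize(query).every((token) => fieldTokens.has(token));
}

console.log(containsAllTokens("San Francisco, CA", "san francisco")); // true
console.log(containsAllTokens("SAN FRANCISCO", "San Francisco")); // true
console.log(containsAllTokens("Seattle, WA", "San Francisco")); // false
```

Under this simplified model, case and punctuation stop mattering, which is why token matching suits free-text fields better than Eq or In.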

String fields (language, name, etc.):

{ field: 'language', op: 'Eq', value: 'Python' }
{ field: 'language', op: 'In', value: ['Python', 'JavaScript', 'Go'] }
{ field: 'language', op: 'NotIn', value: ['HTML', 'CSS'] }

Number fields (stars, issues):

{ field: 'stargazerCount', op: 'Gte', value: 1000 }  // >= 1000 stars
{ field: 'stargazerCount', op: 'Lte', value: 50000 } // <= 50000 stars
{ field: 'stargazerCount', op: 'Gt', value: 999 }    // > 999 stars (exclusive)
{ field: 'stargazerCount', op: 'Lt', value: 10000 }  // < 10000 stars (exclusive)

Full-text search (description, readme, locations, bio, emails, resolved locations):

// Use ContainsAllTokens - handles formatting variations automatically
{ field: 'lastContributorLocations', op: 'ContainsAllTokens', value: 'San Francisco' }
{ field: 'readmePreview', op: 'ContainsAllTokens', value: 'react typescript' }
{ field: 'bio', op: 'ContainsAllTokens', value: 'kubernetes cloud' }
{ field: 'resolvedCity', op: 'ContainsAllTokens', value: 'San Francisco' }
{ field: 'resolvedCountry', op: 'ContainsAllTokens', value: 'United States' }

Combining filters:

// AND - all conditions must match
{
  op: 'And',
  filters: [
    { field: 'language', op: 'Eq', value: 'Python' },
    { field: 'stargazerCount', op: 'Gte', value: 1000 }
  ]
}

// OR - any condition can match (use FTS for locations)
{
  op: 'Or',
  filters: [
    { field: 'resolvedCity', op: 'ContainsAllTokens', value: 'San Francisco' },
    { field: 'resolvedCity', op: 'ContainsAllTokens', value: 'Seattle' }
  ]
}
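
When filters get nested, small builder functions can keep them readable. The sketch below is one possible approach, not part of the client: the type names (FieldFilter, CompositeFilter) and the helpers (eq, gte, tokens, and, or) are hypothetical and simply mirror the object shapes shown above.

```typescript
// Hypothetical types mirroring the filter shapes in this guide; not the SDK's own types.
interface FieldFilter {
  field: string;
  op: "Eq" | "In" | "NotIn" | "Gte" | "Lte" | "Gt" | "Lt" | "ContainsAllTokens";
  value: string | number | string[];
}

interface CompositeFilter {
  op: "And" | "Or";
  filters: Filter[];
}

type Filter = FieldFilter | CompositeFilter;

// Hypothetical helpers that return plain filter objects.
const eq = (field: string, value: string): FieldFilter => ({ field, op: "Eq", value });
const gte = (field: string, value: number): FieldFilter => ({ field, op: "Gte", value });
const tokens = (field: string, value: string): FieldFilter => ({
  field,
  op: "ContainsAllTokens",
  value,
});
const and = (...filters: Filter[]): CompositeFilter => ({ op: "And", filters });
const or = (...filters: Filter[]): CompositeFilter => ({ op: "Or", filters });

// Python repos with 1000+ stars and contributors in either city.
const pythonReposInTwoCities: Filter = and(
  eq("language", "Python"),
  gte("stargazerCount", 1000),
  or(
    tokens("lastContributorLocations", "San Francisco"),
    tokens("lastContributorLocations", "Seattle"),
  ),
);

console.log(JSON.stringify(pythonReposInTwoCities, null, 2));
```

The result is an ordinary object, so it can be passed as the filters value in the same way as the hand-written literals above.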

Repository Filter Fields

Repositories can be filtered by:

githubId - Node ID (string)
ownerLogin - Repository owner username (string)
name - Repository name (string)
stargazerCount - Star count (number, supports Gte/Lte/Gt/Lt)
language - Primary programming language (string)
totalIssuesCount - Total issues (number, supports Gte/Lte/Gt/Lt)
totalIssuesOpen - Open issues (number, supports Gte/Lte/Gt/Lt)
totalIssuesClosed - Closed issues (number, supports Gte/Lte/Gt/Lt)
lastContributorLocations - Contributor locations (string, use ContainsAllTokens)

For complete field documentation, see the Repository Fields Reference.

Controlling Result Size

// Get more results (max 1000)
const response = await client.searchRepos.search({
  query: "data visualization",
  maxResults: 500,
});

// Default is 100 results
const response2 = await client.searchRepos.search({
  query: "data visualization",
});
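
If maxResults comes from user input, it may help to normalize it before calling the client. This is a minimal sketch: the default (100) and maximum (1000) come from the examples above, while the helper name normalizeMaxResults and the lower bound of 1 are assumptions. How the API itself reacts to out-of-range values is not stated here.

```typescript
// Limits taken from the examples above.
const DEFAULT_MAX_RESULTS = 100;
const MAX_RESULTS_LIMIT = 1000;

// Hypothetical helper, not part of the client.
// Falls back to the default for missing or non-finite input and clamps to [1, 1000].
function normalizeMaxResults(requested?: number): number {
  if (requested === undefined || !Number.isFinite(requested)) {
    return DEFAULT_MAX_RESULTS;
  }
  return Math.min(MAX_RESULTS_LIMIT, Math.max(1, Math.floor(requested)));
}

console.log(normalizeMaxResults()); // 100
console.log(normalizeMaxResults(500)); // 500
console.log(normalizeMaxResults(5000)); // 1000
```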

Searching Users

User search uses BM25 full-text search - a keyword-based search algorithm optimized for finding relevant matches across text fields.

Basic Full-Text Search

const response = await client.searchUsers.search({
  query: "machine learning engineer san francisco",
});

// Searches across: emails (3x weight), login (2x weight), displayName, bio, company, location
// Email addresses are weighted highest for precision
console.log(response.users);

How Full-Text Search Works

BM25 ranks results by keyword relevance, not semantic meaning:

// Finds users with these keywords in their profile
await client.searchUsers.search({
  query: "rust compiler developer",
});

// Better: combine search with filters for precision
await client.searchUsers.search({
  query: "rust developer",
  filters: {
    field: "resolvedCountry",
    op: "ContainsAllTokens",
    value: "United States",
  },
});
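
For intuition about keyword ranking, the sketch below implements the textbook BM25 formula over a few made-up profile strings. It is not the service's implementation: the parameters k1 = 1.2 and b = 0.75 are common defaults, the tokenizer is a simplification, and the per-field weights mentioned earlier are left out.

```typescript
// Textbook BM25 with common defaults; the service's real parameters and
// tokenizer are not documented in this guide.
function bm25Scores(query: string, docs: string[], k1 = 1.2, b = 0.75): number[] {
  const split = (text: string): string[] =>
    text
      .toLowerCase()
      .split(/[^a-z0-9]+/)
      .filter((token) => token.length > 0);

  const tokenized = docs.map(split);
  const totalLength = tokenized.reduce((sum, doc) => sum + doc.length, 0);
  const avgLength = totalLength / Math.max(1, docs.length);
  // De-duplicate query terms so a repeated word is not counted twice.
  const queryTerms = split(query).filter((term, i, all) => all.indexOf(term) === i);

  return tokenized.map((doc) => {
    let score = 0;
    for (const term of queryTerms) {
      const termFreq = doc.filter((token) => token === term).length;
      if (termFreq === 0) continue;
      const docsWithTerm = tokenized.filter((d) => d.indexOf(term) !== -1).length;
      // Rare terms get a higher inverse document frequency.
      const idf = Math.log(1 + (docs.length - docsWithTerm + 0.5) / (docsWithTerm + 0.5));
      // Longer-than-average documents are penalized slightly.
      const lengthNorm = 1 - b + b * (doc.length / avgLength);
      score += (idf * termFreq * (k1 + 1)) / (termFreq + k1 * lengthNorm);
    }
    return score;
  });
}

// Made-up profile text for illustration.
const profiles = [
  "Rust compiler developer working on LLVM backends",
  "Frontend engineer, React and TypeScript",
  "Rust enthusiast",
];

console.log(bm25Scores("rust compiler developer", profiles));
```

The first profile matches all three keywords and scores highest, the third matches only one, and the second matches none and scores zero. Exact keywords drive the ranking, which is why user queries work best when they use words likely to appear in profiles.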

User Search with Filters

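// Find senior engineers in San Francisco at specific companies
const response = await client.searchUsers.search({
  query: "senior engineer",
  filters: {
    op: "And",
    filters: [
      {
        field: "resolvedCity",
        op: "ContainsAllTokens",
        value: "San Francisco",
      },
      {
        // Company names are free text, so match tokens rather than exact strings
        op: "Or",
        filters: [
          { field: "company", op: "ContainsAllTokens", value: "Google" },
          { field: "company", op: "ContainsAllTokens", value: "Meta" },
          { field: "company", op: "ContainsAllTokens", value: "Apple" },
        ],
      },
    ],
  },
  maxResults: 100,
});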

User Filter Fields

Users can be filtered by:

githubId - Node ID (string)
login - Username (string)
company - Company name (string)
location - User-provided location (string)
emails - Email addresses (string)
resolvedCountry - Resolved country from location (string)
resolvedState - Resolved state/region (string)
resolvedCity - Resolved city (string)

For complete field documentation, see the User Fields Reference.

Language Tips

See the list of supported languages for the values accepted by the language filter.

Search Best Practices

1. Use Semantic Search for Repositories

Repository search understands context and meaning:

// Good - semantic search finds conceptually related repos
await client.searchRepos.search({
  query: "lightweight frontend framework for single page applications",
});

// The above finds Vue, Svelte, etc. even without exact keyword matches

2. Use Keywords for User Search

User search is keyword-based, not semantic:

// Good - specific keywords from likely profile fields
await client.searchUsers.search({
  query: "typescript react developer",
});

// Less effective - too abstract for keyword search
await client.searchUsers.search({
  query: "experienced frontend engineer with modern stack expertise",
});

3. Combine Search with Filters

Always prefer filters over trying to encode filtering in your query:

// Bad - trying to filter via search query
await client.searchRepos.search({
  query: "python machine learning 1000+ stars",
});

// Good - use structured filters
await client.searchRepos.search({
  query: "machine learning",
  filters: {
    op: "And",
    filters: [
      { field: "language", op: "Eq", value: "Python" },
      { field: "stargazerCount", op: "Gte", value: 1000 },
    ],
  },
});

4. Start Broad, Then Narrow

Begin with a simple query to understand results, then add filters:

// Step 1: Broad search to see what's available
const broad = await client.searchRepos.search({
  query: "api client library",
});

// Step 2: Add filters based on what you learned
const narrow = await client.searchRepos.search({
  query: "api client library",
  filters: {
    op: "And",
    filters: [
      { field: "language", op: "In", value: ["TypeScript", "JavaScript"] },
      { field: "stargazerCount", op: "Gte", value: 100 },
    ],
  },
});
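
One way to decide which filters to add is to tally a field across the broad results. The sketch below counts languages and turns the most common ones into an In filter. The RepoSummary shape (a name plus a nullable language) is an assumption about the result objects, and countLanguages is a hypothetical helper, not a client method.

```typescript
// Assumed result shape: only the fields this sketch needs. The real response may differ.
interface RepoSummary {
  name: string;
  language: string | null;
}

// Hypothetical helper: returns [language, count] pairs, most common first.
function countLanguages(repos: RepoSummary[]): Array<[string, number]> {
  const counts = new Map<string, number>();
  for (const repo of repos) {
    if (repo.language === null) continue;
    counts.set(repo.language, (counts.get(repo.language) ?? 0) + 1);
  }
  return Array.from(counts.entries()).sort((a, b) => b[1] - a[1]);
}

// Made-up sample standing in for the repositories of a broad search.
const broadResults: RepoSummary[] = [
  { name: "http-kit", language: "TypeScript" },
  { name: "rest-wrapper", language: "Go" },
  { name: "fetchling", language: "TypeScript" },
  { name: "docs-only", language: null },
];

const ranked = countLanguages(broadResults);
console.log(ranked); // [["TypeScript", 2], ["Go", 1]]

// Keep the two most common languages for the narrowed search.
const languageFilter = {
  field: "language",
  op: "In",
  value: ranked.slice(0, 2).map(([language]) => language),
};
console.log(languageFilter);
```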

5. Understand Result Scoring

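Both search types return relevance scores. Check which direction a score runs before sorting or thresholding on it; for repository search, lower is better: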

const response = await client.searchRepos.search({
  query: "database orm",
});

response.repositories.forEach((repo) => {
  // Repository scores are cosine distances, so lower = more relevant
  console.log(`${repo.name}: ${repo.score}`);
});
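
To see why lower is better for a cosine distance, the sketch below computes it for toy two-dimensional vectors. Real embeddings have many more dimensions, and the exact scoring the service applies is not specified here; the repository names and vectors are made up.

```typescript
// Cosine distance = 1 - cosine similarity.
// 0 means same direction, 1 means orthogonal, 2 means opposite.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("vectors must be non-empty and the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("cosine distance is undefined for a zero vector");
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy embeddings for illustration.
const queryVector = [1, 0];
const candidates = [
  { name: "unrelated-tool", vector: [0, 1] },
  { name: "orm-toolkit", vector: [0.9, 0.1] },
  { name: "opposite-topic", vector: [-1, 0] },
];

// Sort ascending: the smallest distance is the closest match.
const byRelevance = candidates
  .map((c) => ({ name: c.name, score: cosineDistance(queryVector, c.vector) }))
  .sort((x, y) => x.score - y.score);

console.log(byRelevance.map((r) => r.name));
// ["orm-toolkit", "unrelated-tool", "opposite-topic"]
```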

Common Patterns

Find Popular Projects in Specific Language

const response = await client.searchRepos.search({
  query: "web framework",
  filters: {
    op: "And",
    filters: [
      { field: "language", op: "Eq", value: "Go" },
      { field: "stargazerCount", op: "Gte", value: 1000 },
    ],
  },
  maxResults: 50,
});

Find High-Quality Repos by Contributor Location

Use ContainsAllTokens on locations so that case and formatting variations still match:

// Quality Python ML repos with San Francisco contributors
// (matches "San Francisco, CA", "san francisco", etc.)
const response = await client.searchRepos.search({
  query: "machine learning",
  filters: {
    op: "And",
    filters: [
      {
        field: "lastContributorLocations",
        op: "ContainsAllTokens",
        value: "San Francisco",
      },
      { field: "language", op: "Eq", value: "Python" },
      { field: "stargazerCount", op: "Gte", value: 1000 },
    ],
  },
  maxResults: 50,
});

Find Repos by Technology Stack

Use ContainsAllTokens on readmePreview or description to find repos by tech stack:

// Find React component libraries (checks README, not just name)
const response = await client.searchRepos.search({
  query: "ui component library",
  filters: {
    op: "And",
    filters: [
      {
        field: "readmePreview",
        op: "ContainsAllTokens",
        value: "react typescript",
      },
      { field: "stargazerCount", op: "Gte", value: 500 },
    ],
  },
});

// Find Kubernetes-native apps with quality threshold
const response2 = await client.searchRepos.search({
  query: "cloud native application",
  filters: {
    op: "And",
    filters: [
      {
        field: "description",
        op: "ContainsAllTokens",
        value: "kubernetes helm",
      },
      { field: "stargazerCount", op: "Gte", value: 100 },
      { field: "language", op: "In", value: ["Go", "Rust"] },
    ],
  },
});

Find Active, Well-Maintained Projects

// Active projects with good community engagement
const response = await client.searchRepos.search({
  query: "web framework",
  filters: {
    op: "And",
    filters: [
      { field: "stargazerCount", op: "Gte", value: 1000 },
      { field: "totalIssuesOpen", op: "Gte", value: 10 },
      { field: "totalIssuesClosed", op: "Gte", value: 50 },
      { field: "language", op: "NotIn", value: ["HTML", "CSS"] },
    ],
  },
});

Find Developers in Tech Hubs

// Rust developers in major tech cities (use FTS for locations)
const response = await client.searchUsers.search({
  query: "rust systems programming",
  filters: {
    op: "Or",
    filters: [
      {
        field: "resolvedCity",
        op: "ContainsAllTokens",
        value: "San Francisco",
      },
      { field: "resolvedCity", op: "ContainsAllTokens", value: "Seattle" },
      { field: "resolvedCity", op: "ContainsAllTokens", value: "New York" },
    ],
  },
  maxResults: 100,
});

Find Developers by Skills + Location

Use ContainsAllTokens on bio to find specific expertise:

// Kubernetes experts in US tech hubs
const response = await client.searchUsers.search({
  query: "infrastructure engineer",
  filters: {
    op: "And",
    filters: [
      {
        field: "bio",
        op: "ContainsAllTokens",
        value: "kubernetes cloud native",
      },
      {
        op: "Or",
        filters: [
          {
            field: "resolvedCity",
            op: "ContainsAllTokens",
            value: "San Francisco",
          },
          { field: "resolvedCity", op: "ContainsAllTokens", value: "Seattle" },
          { field: "resolvedCity", op: "ContainsAllTokens", value: "Austin" },
        ],
      },
    ],
  },
});
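
When the list of cities comes from data rather than being typed by hand, the nested filter can be generated. This is a sketch under stated assumptions: skillsInCities is a hypothetical helper, and the Leaf and Group types only describe the object shapes used in this example.

```typescript
// Hypothetical types describing just the shapes used in this example.
type Leaf = { field: string; op: "ContainsAllTokens"; value: string };
type Group = { op: "And" | "Or"; filters: Array<Leaf | Group> };

// Hypothetical helper: bio must contain all keywords AND the city must be one of `cities`.
function skillsInCities(bioKeywords: string, cities: string[]): Group {
  const cityFilters = cities.map(
    (city): Leaf => ({ field: "resolvedCity", op: "ContainsAllTokens", value: city }),
  );
  return {
    op: "And",
    filters: [
      { field: "bio", op: "ContainsAllTokens", value: bioKeywords },
      { op: "Or", filters: cityFilters },
    ],
  };
}

const generatedFilter = skillsInCities("kubernetes cloud native", [
  "San Francisco",
  "Seattle",
  "Austin",
]);
console.log(JSON.stringify(generatedFilter, null, 2));
```

The generated object has the same structure as the hand-written filter above, so it can be passed as the filters value unchanged.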

Find Developers at Specific Companies

// ML engineers at AI companies (using email domains)
const response = await client.searchUsers.search({
  query: "machine learning",
  filters: {
    op: "Or",
    filters: [
      { field: "emails", op: "ContainsAllTokens", value: "@openai.com" },
      { field: "emails", op: "ContainsAllTokens", value: "@anthropic.com" },
      { field: "company", op: "ContainsAllTokens", value: "Google DeepMind" },
    ],
  },
});

Next Steps

Check the Filter Operators Reference for complete operator documentation
Review the Repository Fields and User Fields references
Learn about custom ranking formulas