Sentry Global Search JavaScript Library | Algolia
---|---
Installation, Usage, Configuration, Results | Constructing Algolia Records, Ranking and Sorting, Index settings, Synonyms
The Sentry Global Search JavaScript library provides an easy way to query across all Sentry static sites and get consistent, normalized results without needing to worry about Algolia configuration and the complexities of each index. Sources include
yarn add @sentry-internal/global-search
Initialize the search client with one or more site slugs. The order of the slugs determines the order of results.
import {SentryGlobalSearch} from '@sentry-internal/global-search';
// This will include all sites in the results
const search = new SentryGlobalSearch([
'docs',
'develop',
'help-center',
'blog',
]);
const results = await search.query('erlang');
By default, the SentryGlobalSearch constructor takes an array of slugs matching the supported sites. To set per-site options, you may provide an object in place of any slug.
const search = new SentryGlobalSearch([
{
site: 'docs',
pathBias: true,
},
'develop',
'help-center',
'blog',
]);
- site: Required String of a valid site slug.
- pathBias: Optional Boolean indicating whether to bias path match results if a path is provided to the query. Default: false.
- platformBias: Optional Boolean indicating whether to bias platform match results if a platform is provided to the query. Default: true.
- legacyBias: Optional Boolean indicating whether to bias legacy results. Default: true.
When more than one bias is configured, the following priority is used:
- Same or child path
- Same or parent platform
- Everything else
- Legacy docs
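The priority order above can be sketched as a local sort. This is only an illustration of the ordering, not the library's implementation: biasBucket, isSameOrParentPlatform, and sortByBias are hypothetical helpers, the record fields follow the record structure described later in this document, and the sketch assumes legacy records always sort last even when they match the path or platform.

```javascript
// Hypothetical check: a record matches when one of its platform slugs equals
// the requested platform or is a parent of it (sentry.javascript is a parent
// of sentry.javascript.react).
function isSameOrParentPlatform(record, platform) {
  return (record.platforms || []).some(
    (slug) => platform === slug || platform.startsWith(slug + '.')
  );
}

// Lower bucket numbers sort first.
function biasBucket(record, { path, platform } = {}) {
  if (record.legacy) return 3; // assumption: legacy docs always come last
  if (path && (record.pathSegments || []).includes(path)) return 0; // same or child path
  if (platform && isSameOrParentPlatform(record, platform)) return 1; // same or parent platform
  return 2; // everything else
}

// Array.prototype.sort is stable, so records in the same bucket keep their
// original relative order.
function sortByBias(records, context) {
  return [...records].sort(
    (a, b) => biasBucket(a, context) - biasBucket(b, context)
  );
}
```

Because the sort is stable, the relevance order returned by the search backend is preserved within each bucket.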
query takes an optional second Object argument, which can be used to configure the results.
const results = await search.query('configuration', {
searchAllIndexes: true,
platform: 'sentry.erlang',
});
- path: String of a path in the format /foo/bar/. Results with a matching or subordinate path will appear first.
- platform: String of a valid SDK slug. Results matching this slug will appear first, or after path results when a path is also given.
- searchAllIndexes: Boolean, default false. Searches all configured indexes if true; otherwise, searches only the first.
SentryGlobalSearch returns an Array of Site objects and normalizes the list of Hits so that components are straightforward to create. If a site is configured to include results from multiple indexes (for example, during a content migration), those hits will be combined in the final output as a single list of hits for that site.
[
{
"site": "docs",
"name": "Documentation",
"hits": [
{
"id": "bbb19a43-5e51-5397-8ba0-9112999b5153",
"site": "site-slug",
"title": "Section within document",
"text": "…snippet text is a paragraph within the document with <mark>content that matches</mark> the provided query…",
"url": "https://result.url#section-within-document",
"context": {
"context1": "If present, this is the primary context information",
"context2": "If present, this is secondary context information"
}
}
]
}
]
A Site object has the following keys:
- site: Slug for the site these results are associated with.
- name: Human-friendly name of the site these results are associated with.
- hits: Array of Hit objects representing search results.
A Hit object contains search data from Algolia, normalized for use in Sentry search. Where indicated, text matching the given query is highlighted with unescaped mark tags. Unless noted otherwise, all values are strings.
- id: objectID from Algolia. Useful as a React key.
- site: Slug for the site this hit is from.
- title: (Highlighted) Title of the hit. Typically, this is the section heading this record is under.
- context: Object containing additional detail to contextualize the search result. Varies by site and by record.
  - context1: String representing primary context information.
  - context2: String representing secondary context information.
- url: URL of the match, including a deep link to the section it is in.
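As a rough sketch of how the normalized output might be consumed, the following turns one Site object into an HTML string. The renderSite and escapeHtml helpers are hypothetical and not part of the library. The sketch assumes title and text may carry mark tags and are inserted as-is, which trusts the index content, while every other field is escaped.

```javascript
// Hypothetical helper: escape text that should never be treated as markup.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical renderer for a single Site object and its hits.
function renderSite(site) {
  const items = site.hits.map((hit) => {
    // context and its keys are optional, so guard before reading them.
    const context =
      hit.context && hit.context.context1
        ? `<small>${escapeHtml(hit.context.context1)}</small>`
        : '';
    // title and text are inserted unescaped so the <mark> highlights survive.
    return (
      `<li data-id="${escapeHtml(hit.id)}">` +
      `<a href="${escapeHtml(hit.url)}">${hit.title}</a>` +
      `<p>${hit.text}</p>${context}</li>`
    );
  });
  return `<section><h2>${escapeHtml(site.name)}</h2><ul>${items.join('')}</ul></section>`;
}
```

In a React component the same split applies: id serves as the key, and the highlighted fields need to be rendered as HTML rather than as plain text.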
While not all indexes follow this guide, the preferred strategy for indexing records is thus: Rather than considering an entire document a record, each block level tag (excluding headings) is a record. That is, each paragraph is a record, lists are flattened into single records, etc.
Headings should not have their own records. They are searchable as the section value of other records and are used as the distinct value for deduplication.
Ideally, a record object should include the following keys:
- section (String): Text of the last heading seen. Initially set to the document title.
- text (String): Text content of the record.
- keywords ([String]): Specific words the record should be searchable by, which may not appear in the section or text.
- title (String): Title of the document this record comes from.
- url (String): URL for the document.
- anchor (String): id attribute matching the section heading, for deep linking.
- platforms ([String]): SDK slugs for platform sorting.
- pathSegments ([String]): Segments of the document path for path sorting.
- position (Number): Position in the document. Starts at 0, increments for each record.
- sectionRank (Number): Rank of header. H1: 100, H2: 90, H3: 80.
- legacy (Boolean): Indicates whether this is a record within a legacy document.
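The record structure above can be illustrated with a small builder. This is a sketch under stated assumptions, not the indexer any Sentry site uses: buildRecords and the input shape (a doc with title, url, and an ordered blocks array of heading and non-heading entries) are hypothetical, and the document title is treated like an H1 for sectionRank.

```javascript
// Hypothetical input: { title, url, blocks: [...] } where each block is
// { type: 'heading', level, text, id } or { type: 'block', text }.
function buildRecords(doc) {
  const records = [];
  let section = doc.title; // section starts as the document title
  let anchor = '';
  let sectionRank = 100; // assumption: the title ranks like an H1

  for (const block of doc.blocks) {
    if (block.type === 'heading') {
      // Headings do not become records; they update the current section.
      section = block.text;
      anchor = block.id;
      sectionRank = 110 - block.level * 10; // H1: 100, H2: 90, H3: 80
      continue;
    }
    records.push({
      section,
      text: block.text,
      keywords: doc.keywords || [],
      title: doc.title,
      url: doc.url,
      anchor,
      platforms: doc.platforms || [],
      pathSegments: doc.pathSegments || [],
      position: records.length, // starts at 0, increments per record
      sectionRank,
      legacy: Boolean(doc.legacy),
    });
  }
  return records;
}
```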
Results are ranked using Algolia's built-in algorithm. Ties are broken using the following prioritization: section > keywords > text.
In some cases, we may wish to float results of pages that are subordinate to the current page higher than pages elsewhere in a site. That is, when on /foo/, results for /foo/bar/ should appear before results on /bat/.
To do this, each record includes a pathSegments array containing its own path and all parent paths. For example, a record for /foo/bar/ will look like:
pathSegments: [
'/foo/',
'/foo/bar/'
]
When doing a search while on the page /foo/, we tell Algolia to put all records containing a /foo/ path segment first in the list.
In most cases, searches are done in the context of a specific platform. We float the results from a given platform to the top of the list by indexing a record's SDK and framework and then using Algolia's optionalFilters to request the appropriate platform results. We also want results from a platform's family to be promoted. For example, if the priority is React, JavaScript results should appear below React results but above everything else.
Records include an sdk property and a framework property. sdk is the appropriate SDK slug, and framework is the appropriate framework slug if applicable, or the SDK slug otherwise. The format is entity.sdk[.framework].
Example record:
sdk: 'sentry.javascript',
framework: 'sentry.javascript.react',
Using Algolia's optionalFilters, each record scores a "point" each for an SDK match and a framework match.
This means, if the user is in the React SDK docs, we can prioritize our results as such:
- Put all records matching sentry.javascript.react first.
- Show results which contain sentry.javascript next.
- Show everything else last.
And if a user is in the JavaScript SDK docs, we can prioritize our results as such:
- Put all records matching sentry.javascript first.
- Show results which contain sentry.javascript.<framework> next.
- Show everything else last.
Including the entity portion of the SDK slug originally gave us the ability to rank 1st party SDKs higher than 3rd party SDKs. We no longer index or use the entity portion of the SDK slug, as we only index docs for 1st party SDKs.
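The point scoring described above can be sketched as follows. The filter strings use Algolia's attribute:value form for optionalFilters. The helper names platformOptionalFilters and matchScore are hypothetical, the slugs follow the example record shown earlier (including the entity prefix), and the local scoring only imitates the one-point-per-match behavior for illustration.

```javascript
// Hypothetical helper: build optionalFilters for a platform slug in the
// entity.sdk[.framework] format used by the example record.
function platformOptionalFilters(platform) {
  const [entity, sdk] = platform.split('.');
  const sdkSlug = `${entity}.${sdk}`;
  // When there is no framework part, framework falls back to the SDK slug.
  return [`sdk:${sdkSlug}`, `framework:${platform}`];
}

// Local imitation of the scoring: one point per filter the record satisfies.
function matchScore(record, filters) {
  return filters.filter((filter) => {
    const separator = filter.indexOf(':');
    const attribute = filter.slice(0, separator);
    const value = filter.slice(separator + 1);
    return record[attribute] === value;
  }).length;
}
```

With these filters, a React record scores two points in the React docs and one point in the JavaScript docs, which reproduces both orderings listed above.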
Legacy docs should be searchable, but they should appear last. Records include a legacy value which allows for sorting them last.
We consider a match at the top of the document more important than a match at the bottom of a document.
We consider a match inside an H1 more important than a match inside an H3.
If an index follows the record structure presented above, it should also use the preferred Algolia index settings.
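For orientation, index settings matching the record structure might look roughly like the object below. The setting names are standard Algolia index settings, but the values are assumptions inferred from this document and are not the actual preferred settings referenced above.

```javascript
// Assumed values, inferred from the ranking rules described in this document.
const indexSettings = {
  // Attribute order sets the tie-breaking priority: section > keywords > text.
  searchableAttributes: ['section', 'keywords', 'text'],
  // Headings are the distinct value used for deduplication.
  attributeForDistinct: 'section',
  distinct: true,
  // Prefer matches under higher-level headings, then earlier in the document.
  customRanking: ['desc(sectionRank)', 'asc(position)'],
  // Attributes used in optionalFilters must be declared for faceting.
  attributesForFaceting: [
    'filterOnly(pathSegments)',
    'filterOnly(sdk)',
    'filterOnly(framework)',
  ],
};
```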
Synonyms can be configured in the Algolia Synonym Config. This is used to tell Algolia that "C#" and "csharp" are the same, or that a search for "Cocoa" should show results for both "Swift" and "Objective-C". We also use it to catch common misspellings, so that a search for "reach" will also include "React" results.
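The three cases above map onto Algolia's synonym object format roughly as shown below. The objectID values are made up for illustration, and treating the misspelling as a one-way synonym is an assumption; the real configuration may be set up differently.

```javascript
// Hypothetical synonym entries in Algolia's synonym object format.
const synonyms = [
  // Two-way: "C#" and "csharp" are interchangeable.
  { objectID: 'csharp', type: 'synonym', synonyms: ['C#', 'csharp'] },
  // One-way: searching "Cocoa" also returns Swift and Objective-C results.
  {
    objectID: 'cocoa',
    type: 'oneWaySynonym',
    input: 'Cocoa',
    synonyms: ['Swift', 'Objective-C'],
  },
  // One-way: the misspelling "reach" also returns React results.
  { objectID: 'reach', type: 'oneWaySynonym', input: 'reach', synonyms: ['React'] },
];
```

One-way synonyms keep the relationship directional, so a search for "Swift" would not start returning "Cocoa" results.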