Discovery
Discovery pulse
A cleaner view of what is new, what needs review, and which domains were only partially reachable. This page now prioritises curated discovery results instead of repeating the same homepage rows or surfacing obvious failed crawls as wins.
Queue health is still shown live, but the main tables below focus on meaningful discovery output rather than raw crawler residue.
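The split between fresh discoveries, review candidates, and limited-access rows can be sketched as a small classifier. This is a minimal illustration only: the `CrawlRecord` fields, the `classify` function, the bucket names, and every threshold are hypothetical and are not taken from the pipeline behind this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CrawlRecord:
    """Hypothetical shape of one crawl result; all field names are assumptions."""
    domain: str
    http_status: Optional[int]  # None when no usable response was recorded
    blocked: bool               # e.g. a bot-protection challenge was served
    indexable: bool
    score_confidence: int       # 0-100
    spam: int                   # 0-100
    fraud: int                  # 0-100
    nsfw: int                   # 0-100


def classify(rec: CrawlRecord,
             min_confidence: int = 40,
             spam_limit: int = 45,
             fraud_limit: int = 26,
             nsfw_limit: int = 50) -> str:
    """Assign a record to one of three buckets (thresholds are illustrative)."""
    # Blocked, non-200, or low-confidence fetches never count as discoveries.
    if rec.blocked or rec.http_status != 200 or rec.score_confidence < min_confidence:
        return "limited_access"
    # Reachable pages with negative signals go to a human review queue.
    if (not rec.indexable or rec.spam >= spam_limit
            or rec.fraud >= fraud_limit or rec.nsfw >= nsfw_limit):
        return "needs_review"
    return "fresh_discovery"


if __name__ == "__main__":
    samples = [
        CrawlRecord("clean.example", 200, False, True, 85, 10, 5, 0),
        CrawlRecord("spammy.example", 200, False, True, 80, 59, 34, 0),
        CrawlRecord("walled.example", 403, True, False, 12, 0, 0, 0),
    ]
    for rec in samples:
        print(rec.domain, classify(rec))
```

Checking the access state first keeps a blocked or partial fetch from being scored as if the page had been read in full.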
| Metric | Count | Detail |
|---|---|---|
| Fresh discoveries | 18 | Curated live, indexable results |
| Needs review | 18 | Mixed or stronger negative signals |
| Limited access | 12 | Blocked, rate-limited, or partial fetches |
| Pending pipeline jobs | 59 | Active 380 · failed 319 |
| Processed total | 9,832,959 | Live queue pulse |
Curated discovery
Fresh discoveries
| Domain | Crawl health | Authority / Trust | Why it surfaced |
|---|---|---|---|
| The GitHub Blog · github.blog · en-US | HTTP 200 · 47 ms · crawled 2026-06-03 15:49:55 | 49/88 · Score 42/100 | Quality 100 · 91 internal links · 3,501 words · 3 schema |
| Now Book It · nowbookit.com · en-au | HTTP 200 · 1,041 ms · crawled 2026-06-04 02:29:49 | 49/92 · Score 42/100 | Quality 100 · 36 internal links · 4,682 words · 5 schema |
| GitHub · github.com · en | HTTP 200 · 44 ms · crawled 2026-06-03 15:49:58 | 56/87 · Score 42/100 | Quality 93 · 75 internal links · 2,445 words · 0 schema |
| Green Business Journal · greenbusinessjournal.co.uk · en-US | HTTP 200 · 1,686 ms · crawled 2026-03-23 03:59:45 | 36/75 · Score 59/100 | Quality 100 · 55 internal links · 6,132 words · 2 schema |
| Forbes · forbes.com · en | HTTP 200 · 202 ms · crawled 2026-06-03 06:30:25 | 52/86 · Score 42/100 | Quality 88 · 547 internal links · 9,889 words · 1 schema |
| Digital Bees · bees.digital · pt-BR | HTTP 200 · 1,902 ms · crawled 2026-06-04 02:32:36 | 45/84 · Score 56/100 | Quality 98 · 10 internal links · 6,931 words · 1 schema |
| Lesy České republiky, s. p. · lesycr.cz · cs-CZ | HTTP 200 · 150 ms · crawled 2026-06-03 02:33:32 | 47/81 · Score 42/100 | Quality 100 · 121 internal links · 1,655 words · 13 schema |
| freifunk-kreisgt.de · de | HTTP 200 · 315 ms · crawled 2026-03-20 07:26:12 | 34/52 · Score 59/100 | Quality 91 · 164 internal links · 1,828 words · 0 schema |
| jpmorgan.com.br · pt-BR | HTTP 200 · 1,316 ms · crawled 2026-06-03 06:31:59 | 46/86 · Score 42/100 | Quality 85 · 64 internal links · 2,368 words · 5 schema |
| githubstatus.com · en | HTTP 200 · 312 ms · crawled 2026-06-03 15:50:29 | 47/80 · Score 42/100 | Quality 86 · 15 internal links · 6,694 words · 0 schema |
| Cortopia Studios · cortopia.com · en-US | HTTP 200 · 1,882 ms · crawled 2026-06-03 19:03:45 | 44/83 · Score 42/100 | Quality 89 · 18 internal links · 2,721 words · 1 schema |
| FIS Global · fisglobal.com · en | HTTP 200 · 839 ms · crawled 2026-06-03 06:29:57 | 47/85 · Score 42/100 | Quality 91 · 37 internal links · 8,330 words · 1 schema |
| GitHub · github.community · en | HTTP 200 · 1,766 ms · crawled 2026-06-03 15:50:27 | 47/77 · Score 42/100 | Quality 86 · 114 internal links · 2,445 words · 1 schema |
| Open Photography Forums · openphotographyforums.com · en-US | HTTP 200 · 625 ms · crawled 2026-06-04 00:25:06 | 39/63 · Score 42/100 | Quality 88 · 181 internal links · 3,522 words · 5 schema |
| GitHub Careers · github.careers · en | HTTP 200 · 1,208 ms · crawled 2026-06-03 15:50:34 | 44/81 · Score 42/100 | Quality 87 · 91 internal links · 7,036 words · 0 schema |
| bny.com · en-US | HTTP 200 · 877 ms · crawled 2026-06-03 06:30:35 | 44/82 · Score 42/100 | Quality 89 · 120 internal links · 2,807 words · 1 schema |
| chase.com · en-US | HTTP 200 · 659 ms · crawled 2026-06-03 06:32:15 | 47/83 · Score 42/100 | Quality 87 · 80 internal links · 4,007 words · 0 schema |
| gulfair.com · en | HTTP 200 · 398 ms · crawled 2026-03-20 15:24:44 | 39/42 · Score 42/100 | Quality 95 · 31 internal links · 52,062 words · 0 schema |
Review queue
Needs review
| Domain | Tags | Scores | Why it needs review |
|---|---|---|---|
| bombich.com | | 24/100 · Spam 59/100 · Fraud 36/100 | HTTP 200 · indexable · Quality 69 · NSFW 0/100 · tag confidence 82/100 |
| dakotacon.org | | 28/100 · Spam 59/100 · Fraud 35/100 | HTTP 200 · indexable · Quality 64 · NSFW 0/100 · tag confidence 49/100 |
| xxx.com | | 28/100 · Spam 31/100 · Fraud 34/100 | HTTP 200 · indexable · Quality 75 · NSFW 100/100 · tag confidence 55/100 |
| groovecoaster.jp | | 28/100 · Spam 59/100 · Fraud 34/100 | HTTP 200 · indexable · Quality 74 · NSFW 0/100 · tag confidence 40/100 |
| musicdiver.jp · MUSIC DIVER 公式サイト｜株式会社タイトー | | 28/100 · Spam 59/100 · Fraud 34/100 | HTTP 200 · indexable · Quality 73 · NSFW 0/100 · tag confidence 50/100 |
| artecolaquimica.cl | | 28/100 · Spam 59/100 · Fraud 34/100 | HTTP 200 · indexable · Quality 72 · NSFW 0/100 · tag confidence 40/100 |
| m4iler.cloud | | 26/100 · Spam 59/100 · Fraud 34/100 | HTTP 200 · indexable · Quality 68 · NSFW 0/100 · tag confidence 36/100 |
| yomiuri-golf.co.jp | | 28/100 · Spam 59/100 · Fraud 34/100 | HTTP 200 · indexable · Quality 68 · NSFW 0/100 · tag confidence 50/100 |
| ic3.gov | | 28/100 · Spam 41/100 · Fraud 34/100 | HTTP 200 · indexable · Quality 65 · NSFW 0/100 · tag confidence 50/100 |
| lyngvaer.no | | 28/100 · Spam 59/100 · Fraud 34/100 | HTTP 200 · indexable · Quality 63 · NSFW 0/100 · tag confidence 39/100 |
| jxself.org | | 22/100 · Spam 59/100 · Fraud 34/100 | HTTP 200 · indexable · Quality 60 · NSFW 0/100 · tag confidence 49/100 |
| geologie-et-collections.fr | | 42/100 · Spam 45/100 · Fraud 27/100 | HTTP 200 · indexable · Quality 76 · NSFW 0/100 · tag confidence 81/100 |
| najbrt.cz · Studio Najbrt | | 42/100 · Spam 45/100 · Fraud 27/100 | HTTP 200 · indexable · Quality 72 · NSFW 0/100 · tag confidence 48/100 |
| hotmodelsagency.be | | 35/100 · Spam 45/100 · Fraud 26/100 | HTTP 200 · indexable · Quality 77 · NSFW 0/100 · tag confidence 49/100 |
| owontechnology.eu | | 42/100 · Spam 45/100 · Fraud 26/100 | HTTP 200 · indexable · Quality 77 · NSFW 0/100 · tag confidence 49/100 |
| esri-portugal.pt | | 24/100 · Spam 45/100 · Fraud 26/100 | HTTP 200 · indexable · Quality 75 · NSFW 0/100 · tag confidence 49/100 |
| krombachers-fassbrause.de | | 24/100 · Spam 45/100 · Fraud 26/100 | HTTP 200 · indexable · Quality 73 · NSFW 0/100 · tag confidence 48/100 |
| spezi-krombacher.de | | 24/100 · Spam 45/100 · Fraud 26/100 | HTTP 200 · indexable · Quality 73 · NSFW 0/100 · tag confidence 48/100 |
Crawler limits
Limited-access domains
These rows are separated from the fresh discovery tables so blocked, rate-limited, or partial fetches do not masquerade as top discoveries.
| Domain | Access state | Score snapshot | Notes |
|---|---|---|---|
| lovespassions.be | HTTP 403 · 318 ms · crawled 2026-03-19 04:43:15 | 0/100 · Authority 0 · Trust 0 | Partial fetch or weak confidence. Quality 27 · score confidence 39/100 |
| shopee.com.my | HTTP 200 · 978 ms · crawled 2026-03-30 11:13:15 | 15/100 · Authority 12 · Trust 0 | Partial fetch or weak confidence. Quality 34 · score confidence 28/100 |
| meta.ai | HTTP 403 · 114 ms · crawled 2026-06-04 02:34:09 | 7/100 · Authority 27 · Trust 31 | Partial fetch or weak confidence. Quality 28 · score confidence 12/100 |
| x.com | Blocked / limited · 355 ms · crawled 2026-06-04 02:34:06 | 33/100 · Authority 40 · Trust 53 | Cloudflare challenge or bot protection blocked the crawler. Quality 29 · score confidence 12/100 |
| linkedin.com | Blocked / limited · 409 ms · crawled 2026-06-04 02:33:46 | 34/100 · Authority 42 · Trust 53 | Cloudflare challenge or bot protection blocked the crawler. Quality 27 · score confidence 12/100 |
| nuve.digital | Blocked / limited · 2,106 ms · crawled 2026-06-04 02:32:58 | 23/100 · Authority 14 · Trust 30 | Cloudflare challenge or bot protection blocked the crawler. Quality 21 · score confidence 12/100 |
| iaspcentral.com | Blocked / limited · 180 ms · crawled 2026-06-04 02:30:17 | 1/100 · Authority 25 · Trust 28 | Cloudflare challenge or bot protection blocked the crawler. Quality 14 · score confidence 12/100 |
| raidersshop.com.au | Blocked / limited · 464 ms · crawled 2026-06-04 02:29:53 | 26/100 · Authority 27 · Trust 38 | Cloudflare challenge or bot protection blocked the crawler. Quality 27 · score confidence 12/100 |
| nrl.com | Blocked / limited · 3,807 ms · crawled 2026-06-04 02:29:51 | 25/100 · Authority 32 · Trust 46 | Cloudflare challenge or bot protection blocked the crawler. Quality 21 · score confidence 12/100 |
| raiders.com.au | Blocked / limited · 3,393 ms · crawled 2026-06-04 02:29:49 | 25/100 · Authority 29 · Trust 34 | Cloudflare challenge or bot protection blocked the crawler. Quality 21 · score confidence 12/100 |
| xiv.quest | HTTP 200 · 1,058 ms · crawled 2026-06-04 01:05:33 | 24/100 · Authority 15 · Trust 38 | Partial fetch or weak confidence. Quality 24 · score confidence 27/100 |
| chuhailabs.com | Blocked / limited · 1,448 ms · crawled 2026-06-03 19:04:08 | 24/100 · Authority 15 · Trust 43 | Cloudflare challenge or bot protection blocked the crawler. Quality 24 · score confidence 12/100 |