Discovery
Discovery pulse
A cleaner view of what is new, what needs review, and which domains were only partially reachable. This page now prioritises curated discovery results instead of repeating the same homepage rows or surfacing obvious failed crawls as wins.
Queue health is still shown live, but the main tables below focus on meaningful discovery output rather than raw crawler residue.
| Metric | Count | Detail |
|---|---|---|
| Fresh discoveries | 18 | Curated live, indexable results |
| Needs review | 18 | Mixed or stronger negative signals |
| Limited access | 12 | Blocked, rate-limited, or partial fetches |
| Pending pipeline jobs | 384 | Active 5 · failed 372 |
| Processed total | 5,891,248 | |
Live queue pulse
Curated discovery
Fresh discoveries
| Domain | Crawl health | Authority / Trust | Why it surfaced |
|---|---|---|---|
| WordPress.com · wordpress.com · en | HTTP 200 · 160 ms · crawled 2026-04-17 06:12:24 | 55/90 · Score 42/100 | Quality 98 · 76 internal links · 7,860 words · 1 schema |
| anam.ai · anam.ai · en | HTTP 200 · 345 ms · crawled 2026-04-17 06:06:53 | 48/84 · Score 61/100 | Quality 100 · 26 internal links · 4,823 words · 3 schema |
| The GitHub Blog · github.blog · en-US | HTTP 200 · 23 ms · crawled 2026-04-17 06:09:26 | 51/90 · Score 42/100 | Quality 100 · 93 internal links · 3,527 words · 3 schema |
| WordPress.org · wordpress.org · en-US | HTTP 200 · 420 ms · crawled 2026-04-17 06:11:56 | 55/89 · Score 50/100 | Quality 93 · 25 internal links · 2,248 words · 1 schema |
| GitHub · github.com · en | HTTP 200 · 35 ms · crawled 2026-04-17 06:13:19 | 57/88 · Score 42/100 | Quality 93 · 76 internal links · 2,443 words · 0 schema |
| The Leading Enterprise Content Platform \| WordPress VIP · wpvip.com · en-US | HTTP 200 · 27 ms · crawled 2026-04-17 06:13:41 | 52/93 · Score 42/100 | Quality 99 · 61 internal links · 4,806 words · 1 schema |
| The White House · whitehouse.gov · en-US | HTTP 200 · 53 ms · crawled 2026-04-17 06:13:06 | 54/92 · Score 42/100 | Quality 94 · 85 internal links · 4,285 words · 2 schema |
| AbeBooks · abebooks.com · en | HTTP 200 · 612 ms · crawled 2026-04-17 06:10:46 | 51/88 · Score 42/100 | Quality 94 · 54 internal links · 2,536 words · 1 schema |
| GQ · gq.com · en-US | HTTP 200 · 193 ms · crawled 2026-04-17 06:13:32 | 50/88 · Score 42/100 | Quality 94 · 73 internal links · 40,381 words · 1 schema |
| Amazon Science · amazon.science · en | HTTP 200 · 103 ms · crawled 2026-04-17 06:10:21 | 49/87 · Score 42/100 | Quality 96 · 94 internal links · 5,948 words · 0 schema |
| Veeqo · veeqo.com · en-US | HTTP 200 · 203 ms · crawled 2026-04-17 06:10:42 | 48/88 · Score 42/100 | Quality 93 · 25 internal links · 2,628 words · 1 schema |
| Audible.com · audible.com · en-US | HTTP 200 · 714 ms · crawled 2026-04-17 06:10:34 | 50/85 · Score 42/100 | Quality 86 · 76 internal links · 2,430 words · 2 schema |
| Automattic · automattic.com · en | HTTP 200 · 30 ms · crawled 2026-04-17 06:12:56 | 49/73 · Score 42/100 | Quality 97 · 9 internal links · 4,035 words · 1 schema |
| ICIMS \| The Leading Cloud Recruiting Software · icims.com · en | HTTP 200 · 449 ms · crawled 2026-04-17 06:09:32 | 49/87 · Score 42/100 | Quality 92 · 74 internal links · 11,366 words · 1 schema |
| amazon.com · amazon.com · en-us | HTTP 200 · 768 ms · crawled 2026-04-17 06:11:00 | 52/80 · Score 42/100 | Quality 80 · 146 internal links · 4,024 words · 0 schema |
| GitHub · github.community · en | HTTP 200 · 311 ms · crawled 2026-04-17 06:09:22 | 49/79 · Score 42/100 | Quality 91 · 107 internal links · 2,452 words · 1 schema |
| Blink Smart Security · blinkforhome.com · en-US | HTTP 200 · 85 ms · crawled 2026-04-17 06:10:56 | 46/75 · Score 42/100 | Quality 94 · 16 internal links · 4,531 words · 2 schema |
| WordPress.tv · wordpress.tv · en | HTTP 200 · 37 ms · crawled 2026-04-17 06:12:48 | 48/84 · Score 42/100 | Quality 90 · 27 internal links · 1,882 words · 0 schema |
Review queue
Needs review
| Domain | Tags | Scores | Why it needs review |
|---|---|---|---|
| bombich.com | | 24/100 · Spam 59/100 · Fraud 36/100 | HTTP 200 · indexable · Quality 69 · NSFW 0/100 · tag confidence 82/100 |
| dakotacon.org | | 28/100 · Spam 59/100 · Fraud 35/100 | HTTP 200 · indexable · Quality 64 · NSFW 0/100 · tag confidence 49/100 |
| xxx.com | | 28/100 · Spam 31/100 · Fraud 34/100 | HTTP 200 · indexable · Quality 75 · NSFW 100/100 · tag confidence 55/100 |
| groovecoaster.jp | | 28/100 · Spam 59/100 · Fraud 34/100 | HTTP 200 · indexable · Quality 74 · NSFW 0/100 · tag confidence 40/100 |
| musicdiver.jp · MUSIC DIVER Official Site \| Taito Corporation | | 28/100 · Spam 59/100 · Fraud 34/100 | HTTP 200 · indexable · Quality 73 · NSFW 0/100 · tag confidence 50/100 |
| artecolaquimica.cl | | 28/100 · Spam 59/100 · Fraud 34/100 | HTTP 200 · indexable · Quality 72 · NSFW 0/100 · tag confidence 40/100 |
| m4iler.cloud | | 26/100 · Spam 59/100 · Fraud 34/100 | HTTP 200 · indexable · Quality 68 · NSFW 0/100 · tag confidence 36/100 |
| yomiuri-golf.co.jp | | 28/100 · Spam 59/100 · Fraud 34/100 | HTTP 200 · indexable · Quality 68 · NSFW 0/100 · tag confidence 50/100 |
| ic3.gov | | 28/100 · Spam 41/100 · Fraud 34/100 | HTTP 200 · indexable · Quality 65 · NSFW 0/100 · tag confidence 50/100 |
| lyngvaer.no | | 28/100 · Spam 59/100 · Fraud 34/100 | HTTP 200 · indexable · Quality 63 · NSFW 0/100 · tag confidence 39/100 |
| jxself.org | | 22/100 · Spam 59/100 · Fraud 34/100 | HTTP 200 · indexable · Quality 60 · NSFW 0/100 · tag confidence 49/100 |
| geologie-et-collections.fr | | 42/100 · Spam 45/100 · Fraud 27/100 | HTTP 200 · indexable · Quality 76 · NSFW 0/100 · tag confidence 81/100 |
| najbrt.cz · Studio Najbrt | | 42/100 · Spam 45/100 · Fraud 27/100 | HTTP 200 · indexable · Quality 72 · NSFW 0/100 · tag confidence 48/100 |
| hotmodelsagency.be | | 35/100 · Spam 45/100 · Fraud 26/100 | HTTP 200 · indexable · Quality 77 · NSFW 0/100 · tag confidence 49/100 |
| owontechnology.eu | | 42/100 · Spam 45/100 · Fraud 26/100 | HTTP 200 · indexable · Quality 77 · NSFW 0/100 · tag confidence 49/100 |
| esri-portugal.pt | | 24/100 · Spam 45/100 · Fraud 26/100 | HTTP 200 · indexable · Quality 75 · NSFW 0/100 · tag confidence 49/100 |
| krombachers-fassbrause.de | | 24/100 · Spam 45/100 · Fraud 26/100 | HTTP 200 · indexable · Quality 73 · NSFW 0/100 · tag confidence 48/100 |
| spezi-krombacher.de | | 24/100 · Spam 45/100 · Fraud 26/100 | HTTP 200 · indexable · Quality 73 · NSFW 0/100 · tag confidence 48/100 |
Crawler limits
Limited-access domains
These rows are separated from the fresh discovery tables so blocked, rate-limited, or partial fetches do not masquerade as top discoveries.
| Domain | Access state | Score snapshot | Notes |
|---|---|---|---|
| lovespassions.be | HTTP 403 · 318 ms · crawled 2026-03-19 04:43:15 | 0/100 · Authority 0 · Trust 0 | partial fetch or weak confidence · Quality 27 · score confidence 39/100 |
| shopee.co.th | HTTP 200 · 1,032 ms · crawled 2026-03-22 13:31:22 | 18/100 · Authority 10 · Trust 4 | partial fetch or weak confidence · Quality 34 · score confidence 14/100 |
| shopee.com.my | HTTP 200 · 978 ms · crawled 2026-03-30 11:13:15 | 15/100 · Authority 12 · Trust 0 | partial fetch or weak confidence · Quality 34 · score confidence 28/100 |
| brightspace.com | Blocked / limited · 505 ms · crawled 2026-04-17 06:22:54 | 25/100 · Authority 23 · Trust 50 | Cloudflare challenge or bot protection blocked the crawler. Quality 27 · score confidence 12/100 |
| rarathemes.com | Blocked / limited · 1,237 ms · crawled 2026-04-17 06:15:50 | 25/100 · Authority 31 · Trust 48 | Cloudflare challenge or bot protection blocked the crawler. Quality 24 · score confidence 12/100 |
| atproto.com | Blocked / limited · 703 ms · crawled 2026-04-17 06:14:19 | 27/100 · Authority 31 · Trust 45 | Cloudflare challenge or bot protection blocked the crawler. Quality 27 · score confidence 12/100 |
| meta.ai | HTTP 403 · 72 ms · crawled 2026-04-17 06:14:18 | 7/100 · Authority 27 · Trust 32 | partial fetch or weak confidence · Quality 28 · score confidence 12/100 |
| meta.com | HTTP 429 · 1,234 ms · crawled 2026-04-17 06:14:13 | 11/100 · Authority 34 · Trust 38 | partial fetch or weak confidence · Quality 25 · score confidence 23/100 |
| bsky.social | Blocked / limited · 423 ms · crawled 2026-04-17 06:14:00 | 28/100 · Authority 33 · Trust 47 | Cloudflare challenge or bot protection blocked the crawler. Quality 27 · score confidence 12/100 |
| publiccode.eu | HTTP 200 · 68 ms · crawled 2026-04-17 06:14:00 | 14/100 · Authority 12 · Trust 17 | partial fetch or weak confidence · Quality 31 · score confidence 28/100 |
| twitter.com | Blocked / limited · 443 ms · crawled 2026-04-17 06:13:54 | 34/100 · Authority 41 · Trust 53 | Cloudflare challenge or bot protection blocked the crawler. Quality 27 · score confidence 12/100 |
| nytimes.com | Blocked / limited · 134 ms · crawled 2026-04-17 06:13:26 | 30/100 · Authority 35 · Trust 52 | DataDome bot protection blocked the crawler. Quality 29 · score confidence 12/100 |
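
The separation into fresh, needs-review, and limited-access rows could be sketched roughly as follows. This is a minimal illustration, not the dashboard's actual logic: the `CrawlResult` fields, the `bucket` function, and every threshold value are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class CrawlResult:
    # Hypothetical record; field names are assumptions, not the real schema.
    domain: str
    http_status: int
    blocked: bool          # bot protection or challenge page detected
    score_confidence: int  # 0-100
    spam: int = 0          # 0-100
    nsfw: int = 0          # 0-100


# Status codes treated as "limited access" in this sketch (forbidden, rate-limited).
LIMITED_STATUSES = {403, 429}


def bucket(result: CrawlResult,
           min_confidence: int = 40,
           spam_threshold: int = 45,
           nsfw_threshold: int = 50) -> str:
    """Assign a crawl result to one of three tables.

    All thresholds are placeholders for illustration only.
    """
    # Blocked, forbidden, or rate-limited fetches never count as discoveries.
    if result.blocked or result.http_status in LIMITED_STATUSES:
        return "limited_access"
    # An HTTP 200 with low score confidence is treated as a partial fetch.
    if result.score_confidence < min_confidence:
        return "limited_access"
    # Reachable pages with negative signals go to manual review.
    if result.spam >= spam_threshold or result.nsfw >= nsfw_threshold:
        return "needs_review"
    return "fresh"


if __name__ == "__main__":
    sample = CrawlResult("example.com", 200, False, score_confidence=80)
    print(sample.domain, bucket(sample))
```

Checking the access state before any quality signal is what keeps a blocked or partial fetch out of the fresh table, even when the response code is HTTP 200.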