ogp-fetch

Open Graph / Twitter Card metadata extractor CLI. Fetch a URL (or pipe HTML on stdin) and get back a clean JSON blob of everything you need to render a link preview: title, description, image, site name, Twitter Card fields, canonical URL, favicon.

Every team that builds a Slack bot, a Discord integration, a blog with link cards, or a notifications pipeline eventually writes this code. It's a small, well-defined problem that's easy to get wrong. og: tags take precedence over twitter: tags, both take precedence over <title>, relative image URLs need resolving, and HTML entities need decoding. ogp-fetch handles all of this in ~80 lines of Python stdlib + httpx.

ogp-fetch https://example.com/article

{
  "url": "https://example.com/article",
  "title": "Hello World",
  "description": "An example page",
  "image": "https://example.com/static/hero.png",
  "site_name": "Example",
  "type": "article",
  "twitter": { "card": "summary_large_image", "site": "@example", "creator": null },
  "canonical": "https://example.com/article",
  "favicon": "https://example.com/favicon.ico"
}

Install

pip install .

Runtime dependency: httpx only.

Usage

# Fetch a URL and emit JSON (default).
ogp-fetch https://example.com/

# Human-readable key: value layout.
ogp-fetch https://example.com/ --format human

# Markdown link-card preview (great for README / Slack).
ogp-fetch https://example.com/ --format markdown

# Pipe HTML from anywhere.
curl -s https://example.com/ | ogp-fetch - --no-resolve

# Pipe with a base URL so relative og:image paths still resolve.
curl -s https://example.com/ | ogp-fetch - --no-resolve --base-url https://example.com/

Options

Flag	Default	Description
`--format {json,human,markdown}`	`json`	Output format
`--user-agent STRING`	`ogp-fetch/0.1.0 (…)`	Sent as the `User-Agent` header
`--timeout SECONDS`	`10`	HTTP timeout
`--max-size BYTES`	`2097152` (2 MB)	Refuse responses larger than this
`--base-url URL`	(fetched URL)	Resolve relative links against this
`--no-resolve`	off	Skip HTTP entirely; requires `-` as the URL

Exit codes

Code	Meaning
`0`	Metadata found
`1`	Fetched/parsed successfully but no OGP / Twitter / `<title>` data
`2`	Fetch, parse, or argument error
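
A script that shells out to ogp-fetch might branch on these codes roughly as follows. This is a minimal sketch, assuming `ogp-fetch` is installed on `PATH` and writes its JSON to stdout on success; `EXIT_MEANINGS`, `describe_exit`, and `fetch_preview` are hypothetical helper names, not part of the package.

```python
import json
import subprocess
from typing import Optional

# Meanings copied from the exit-code table above.
EXIT_MEANINGS = {
    0: "metadata found",
    1: "fetched/parsed, but no OGP / Twitter / <title> data",
    2: "fetch, parse, or argument error",
}


def describe_exit(code: int) -> str:
    """Return a human-readable meaning for an ogp-fetch exit code."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def fetch_preview(url: str) -> Optional[dict]:
    """Hypothetical wrapper: parsed JSON on 0, None on 1, exception otherwise."""
    proc = subprocess.run(["ogp-fetch", url], capture_output=True, text=True)
    if proc.returncode == 0:
        return json.loads(proc.stdout)
    if proc.returncode == 1:
        return None  # page was reachable but carried no preview metadata
    raise RuntimeError(f"ogp-fetch failed: {describe_exit(proc.returncode)}")
```

Treating code 1 as "no preview" rather than an error lets a caller fall back to a bare link without a try/except.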

Why the stdin mode

ogp-fetch - --no-resolve reads HTML from stdin and emits the same JSON without any network traffic. That lets you:

plug it into a curl pipeline without making ogp-fetch responsible for TLS or retries;
test your extraction on a captured HTML fixture;
run it in a sandbox or offline CI job.
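
The same offline mode can be driven from Python, for example in a test that feeds a captured fixture. The sketch below assumes `ogp-fetch` is on `PATH`; `stdin_command` and `extract_offline` are hypothetical names used only for illustration, and only the flags documented above are used.

```python
import subprocess
from typing import List, Optional


def stdin_command(base_url: Optional[str] = None) -> List[str]:
    """Build the argv for offline extraction from stdin."""
    argv = ["ogp-fetch", "-", "--no-resolve"]
    if base_url is not None:
        # Lets relative og:image paths resolve even without a fetched URL.
        argv += ["--base-url", base_url]
    return argv


def extract_offline(html: str, base_url: Optional[str] = None) -> subprocess.CompletedProcess:
    """Feed HTML to ogp-fetch on stdin; the caller inspects returncode and stdout."""
    return subprocess.run(
        stdin_command(base_url), input=html, capture_output=True, text=True
    )
```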

Precedence rules

The extractor collects every meta tag it finds; the normalizer picks the winner:

Field	First checked	Then	Last resort
`title`	`og:title`	`twitter:title`	`<title>`
`description`	`og:description`	`twitter:description`	`<meta name="description">`
`image`	`og:image`	`og:image:url` → `twitter:image`	—
`canonical`	`<link rel="canonical">`	`og:url`	—
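
The collect-then-pick split in the table can be sketched with the standard library alone. This is an illustration of the precedence rules, not the package's actual source: `MetaCollector`, `PRECEDENCE`, `first_of`, and `normalize` are hypothetical names, and the choice to keep the first occurrence of a duplicated tag is an assumption.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional


class MetaCollector(HTMLParser):
    """Collect <meta>, <title>, and <link rel="canonical"> values."""

    def __init__(self) -> None:
        super().__init__()
        self.meta: Dict[str, str] = {}
        self.title: Optional[str] = None
        self.canonical: Optional[str] = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta":
            key = a.get("property") or a.get("name")
            content = a.get("content")
            if key and content is not None:
                # Assumption: the first occurrence of a key wins.
                self.meta.setdefault(key.lower(), content)
        elif tag == "link" and "canonical" in (a.get("rel") or "").lower().split():
            self.canonical = self.canonical or a.get("href")
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title = (self.title or "") + data


# Order mirrors the precedence table: first checked, then, last resort.
PRECEDENCE = {
    "title": ["og:title", "twitter:title"],
    "description": ["og:description", "twitter:description", "description"],
    "image": ["og:image", "og:image:url", "twitter:image"],
}


def first_of(meta: Dict[str, str], keys: List[str]) -> Optional[str]:
    """Return the first non-empty value among keys, or None."""
    for key in keys:
        value = meta.get(key, "").strip()
        if value:
            return value
    return None


def normalize(html: str) -> Dict[str, Optional[str]]:
    collector = MetaCollector()
    collector.feed(html)
    collector.close()
    page_title = (collector.title or "").strip() or None
    return {
        "title": first_of(collector.meta, PRECEDENCE["title"]) or page_title,
        "description": first_of(collector.meta, PRECEDENCE["description"]),
        "image": first_of(collector.meta, PRECEDENCE["image"]),
        "canonical": collector.canonical or first_of(collector.meta, ["og:url"]),
    }
```

`HTMLParser` unescapes entities in attribute values, so a `content="Tom &amp; Jerry"` attribute comes back as `Tom & Jerry` without extra decoding.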

Relative URLs in og:image, twitter:image, canonical, and favicon are resolved to absolute using urllib.parse.urljoin(base_url, value). Protocol-relative //cdn.example.com/x.png works too.
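
A few concrete cases show how `urllib.parse.urljoin` treats the forms mentioned above. The `resolve` wrapper and the example URLs are illustrative assumptions; only `urljoin` itself comes from the text.

```python
from urllib.parse import urljoin


def resolve(base_url: str, value: str) -> str:
    """Hypothetical helper: resolve a possibly relative URL against base_url."""
    return urljoin(base_url, value)


base = "https://example.com/blog/post"

print(resolve(base, "/static/hero.png"))          # https://example.com/static/hero.png
print(resolve(base, "hero.png"))                  # https://example.com/blog/hero.png
print(resolve(base, "//cdn.example.com/x.png"))   # https://cdn.example.com/x.png
print(resolve(base, "https://other.example/a.png"))  # https://other.example/a.png
```

Absolute URLs pass through unchanged, and protocol-relative ones inherit the scheme of the base.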

Docker

docker build -t ogp-fetch .
docker run --rm ogp-fetch --help

# Pipe HTML in:
cat page.html | docker run --rm -i ogp-fetch - --no-resolve --format markdown

The image is a multi-stage Alpine build, runs as a non-root user, and is under 90 MB.

Tests

pip install ".[dev]"
pytest -q

All network paths are exercised via httpx.MockTransport, so the test suite never touches the real network.

License

MIT

Links

📝 dev.to: https://dev.to/sendotltd/every-link-preview-in-your-app-is-the-same-30-lines-you-havent-written-yet-1an8
