How to Use HTML to Markdown Online — Step-by-Step Guide

A backend engineer at a Bangalore fintech recently spent a full Saturday copying 140 API documentation pages from an old Confluence space into a new GitHub-hosted MkDocs site. The HTML pasted into VS Code looked terrible: <p> tags wrapping every line, inline <span style="..."> styling, broken table layouts, and <a href> URLs with tracking parameters that shouldn't be in a clean README. Three hours in, he gave up the manual approach and ran each page through an HTML to Markdown converter. The remaining 110 pages took 40 minutes.
That's the actual job this tool does. It takes messy HTML (from a CMS export, a scraped page, a Confluence dump, a Word doc saved as a web page, or a ChatGPT response with rendered formatting) and produces clean Markdown that drops straight into a README, a Hugo blog post, a Jekyll page, a Notion import, or a GitHub issue. The HTML to Markdown converter on SabTools handles the conversion entirely in your browser; you paste markup on the left, Markdown comes out on the right.
Why Indian dev teams keep needing this conversion
Markdown won the developer documentation war somewhere around 2018. GitHub READMEs, GitLab wikis, Stack Overflow answers, Discord, Slack code blocks, Hugo and Jekyll static sites, MkDocs, Docusaurus, Obsidian notes, Notion exports — all of them speak Markdown. Meanwhile, the legacy content most Indian companies sit on lives in HTML: WordPress posts, Drupal pages, Confluence spaces, custom CMSes built in PHP a decade ago.
Some concrete scenarios where Indian developers run this conversion every week:
- WordPress to Hugo/Jekyll migration. A Chennai-based SaaS company with 800 blog posts moves off WordPress to cut hosting costs from ₹18,000/month to ₹400/month on Cloudflare Pages. Each post's post_content field is HTML and needs Markdown.
- Razorpay/PayU integration docs. A developer copies the rendered HTML from a payment gateway's documentation page into the internal team wiki, which uses Markdown.
- Confluence to GitHub Wiki. Engineering teams at TCS, Infosys, and Wipro client projects routinely move client documentation between systems during handover.
- Email templates to docs. Marketing teams in Mumbai send HTML promotional emails; the dev team needs the copy in Markdown for the help center.
- AI assistant outputs. ChatGPT, Claude, and Gemini frequently return HTML-formatted text that looks fine in chat but breaks when pasted into a .md file.
For every one of these, the manual approach — find-and-replace in VS Code, hand-deleting <div> wrappers, fixing list indentation — is slow and error-prone. A converter that understands HTML semantics does the job in seconds.
What the converter actually handles
A naive HTML-to-Markdown approach (just strip the angle brackets) breaks immediately on anything beyond a paragraph. Real HTML has nested structures, attributes, semantic elements, and edge cases that need rule-based handling. Here's what gets translated:
- Headings: <h1> through <h6> become # through ######.
- Inline formatting: <strong> and <b> become **bold**; <em> and <i> become *italic*; <code> becomes backticks.
- Links: <a href="https://razorpay.com/docs">Razorpay Docs</a> becomes [Razorpay Docs](https://razorpay.com/docs).
- Images: <img src="upi-flow.png" alt="UPI flow"> becomes ![UPI flow](upi-flow.png).
- Lists: Ordered (<ol>) and unordered (<ul>) lists, including nested lists, become 1. and - with correct indentation.
- Code blocks: <pre><code> wrapped content becomes fenced code blocks with triple backticks.
- Tables: <table> with <thead> and <tbody> rows becomes GitHub-flavored Markdown tables with pipes and dashes.
- Blockquotes: <blockquote> becomes > prefixed lines.
- Horizontal rules: <hr> becomes ---.
Things that get stripped or simplified: inline style attributes, presentational <span> wrappers, <div> containers (since Markdown has no concept of generic blocks), and most class names. That's intentional — the whole point of Markdown is to be portable and unstyled. If you need to preserve CSS-driven styling, you don't actually want Markdown.
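To make "rule-based handling" concrete, here is a minimal sketch of the idea using only Python's standard library. It is not SabTools' actual engine; the class and function names (MiniMarkdown, html_to_markdown) are made up for illustration, and it only covers headings, paragraphs, bold/italic, inline code, links, and lists. It does not escape Markdown special characters or handle tables and code blocks.

```python
import re
from html.parser import HTMLParser

INLINE = {"strong": "**", "b": "**", "em": "*", "i": "*", "code": "`"}
HEADINGS = {f"h{n}": "#" * n + " " for n in range(1, 7)}


class MiniMarkdown(HTMLParser):
    """Toy converter (illustrative only): one rule per tag."""

    def __init__(self):
        super().__init__()
        self.out = []      # Markdown fragments, joined at the end
        self.lists = []    # stack of [tag, item_count] for open <ul>/<ol>
        self.href = ""     # target of the <a> currently open

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self.out.append("\n\n" + HEADINGS[tag])
        elif tag == "p":
            self.out.append("\n\n")
        elif tag in ("ul", "ol"):
            if not self.lists:
                self.out.append("\n")          # blank line before a top-level list
            self.lists.append([tag, 0])
        elif tag == "li" and self.lists:
            current = self.lists[-1]
            current[1] += 1
            marker = f"{current[1]}. " if current[0] == "ol" else "- "
            self.out.append("\n" + "    " * (len(self.lists) - 1) + marker)
        elif tag in INLINE:
            self.out.append(INLINE[tag])
        elif tag == "a":
            self.href = dict(attrs).get("href") or ""
            self.out.append("[")
        # <div>, <span> and unknown tags emit nothing; their text still flows through.

    def handle_endtag(self, tag):
        if tag in HEADINGS or tag == "p":
            self.out.append("\n\n")
        elif tag in ("ul", "ol") and self.lists:
            self.lists.pop()
            if not self.lists:
                self.out.append("\n\n")
        elif tag in INLINE:
            self.out.append(INLINE[tag])
        elif tag == "a":
            self.out.append(f"]({self.href})")

    def handle_data(self, data):
        text = re.sub(r"\s+", " ", data)       # HTML collapses runs of whitespace
        if not self.out or self.out[-1].endswith(("\n", " ")):
            text = text.lstrip()
        if text:
            self.out.append(text)


def html_to_markdown(html):
    parser = MiniMarkdown()
    parser.feed(html)
    parser.close()
    text = re.sub(r"[ \t]+\n", "\n", "".join(parser.out))   # drop trailing spaces
    return re.sub(r"\n{3,}", "\n\n", text).strip()
```

Calling html_to_markdown on a small fragment with a heading, a paragraph, and a list returns Markdown with blank lines between the blocks. Note how <div> and <span> need no rule at all: ignoring the tag while keeping its text is the whole "strip the wrapper" behaviour.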
A practical walk-through
Suppose you pulled this HTML snippet from a fintech blog post about UPI transaction limits:
```html
<h2>UPI Limits in 2026</h2>
<p>NPCI raised the <strong>per-transaction UPI limit</strong> for specific categories to <strong>₹5,00,000</strong>. Standard P2P transfers remain at <strong>₹1,00,000</strong> per day.</p>
<ul>
<li>Tax payments: ₹5,00,000</li>
<li>Hospital & education: ₹5,00,000</li>
<li>IPO/RBI Retail Direct: ₹5,00,000</li>
</ul>
<p>Read the <a href="https://npci.org.in/PDF/npci/upi/circular/2024/UPI-OC-203.pdf">official NPCI circular</a>.</p>
```
After pasting that into the converter and clicking Convert, you get:
```markdown
## UPI Limits in 2026

NPCI raised the **per-transaction UPI limit** for specific categories to **₹5,00,000**. Standard P2P transfers remain at **₹1,00,000** per day.

- Tax payments: ₹5,00,000
- Hospital & education: ₹5,00,000
- IPO/RBI Retail Direct: ₹5,00,000

Read the [official NPCI circular](https://npci.org.in/PDF/npci/upi/circular/2024/UPI-OC-203.pdf).
```
That output drops straight into a Hugo content file, a GitHub README, or a Notion page. The rupee symbol, the Indian comma grouping, and the link all survive intact. If your source HTML has the rupee as an entity (&#8377;) and it survives into the output undecoded, run the source through the HTML encoder/decoder first so entities become actual characters before converting.
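If you're scripting that decode step instead of using the online decoder, Python's standard library already covers it. A minimal sketch, with a made-up sample string:

```python
import html

# Sample fragment (invented for illustration) with a numeric and a named entity.
source = "<li>Tax payments: &#8377;5,00,000</li><li>Hospital &amp; education</li>"

# html.unescape handles named (&amp;), decimal (&#8377;) and hex (&#x20B9;) forms.
decoded = html.unescape(source)
print(decoded)
```

One caution: html.unescape also turns &lt; and &gt; into real angle brackets, so running it over a whole document can change markup that was meant to be displayed as text. It is safest on fragments where you know the entities are only symbols and punctuation.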
Tables: the part most converters get wrong
Tables are where most free HTML-to-Markdown converters fall apart. A typical Indian use case: you're documenting GST rates and your source HTML has a four-column table with merged cells, inline styling, and <br> tags inside cells. Markdown tables don't support merged cells or line breaks inside cells natively — so the converter has to make sensible decisions.
SabTools' converter handles standard tables cleanly. A table like:
```html
<table>
<thead>
<tr><th>Slab</th><th>Category</th><th>Example</th></tr>
</thead>
<tbody>
<tr><td>5%</td><td>Essentials</td><td>Packaged food</td></tr>
<tr><td>18%</td><td>Standard</td><td>Most services</td></tr>
</tbody>
</table>
```
becomes:
```markdown
| Slab | Category   | Example       |
| ---- | ---------- | ------------- |
| 5%   | Essentials | Packaged food |
| 18%  | Standard   | Most services |
```
If you're actually trying to do GST math on those slabs rather than just document them, the GST calculator handles 5%, 12%, 18%, and 28% computations directly. For tables with merged cells, the converter will flatten them — you'll need to clean those rows up by hand, but that's a Markdown limitation, not a tool flaw.
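For a sense of the "sensible decisions" involved, here is a rough standard-library sketch of table flattening. It is an illustration, not the converter's real code: the names TableToGfm and table_to_gfm are hypothetical, <br> inside a cell is turned into a space, and rows left short by merged cells are padded with empty cells.

```python
from html.parser import HTMLParser


class TableToGfm(HTMLParser):
    """Collects cell text row by row (illustrative sketch)."""

    def __init__(self):
        super().__init__()
        self.rows = []     # list of rows, each a list of cell strings
        self.cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("th", "td") and self.rows:
            self.cell = []
        elif tag == "br" and self.cell is not None:
            self.cell.append(" ")   # pipe-table cells cannot hold raw line breaks

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self.cell is not None:
            text = " ".join("".join(self.cell).split())
            self.rows[-1].append(text.replace("|", "\\|"))   # escape literal pipes
            self.cell = None

    def handle_data(self, data):
        if self.cell is not None:
            self.cell.append(data)


def table_to_gfm(html):
    parser = TableToGfm()
    parser.feed(html)
    parser.close()
    rows = [row for row in parser.rows if row]
    if not rows:
        return ""
    width = max(len(row) for row in rows)
    rows = [row + [""] * (width - len(row)) for row in rows]   # pad short rows
    header, *body = rows
    lines = ["| " + " | ".join(header) + " |",
             "| " + " | ".join(["---"] * width) + " |"]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)
```

The first row is treated as the header whether or not the source used <thead>, because a pipe table needs a header row. Padding keeps every row the same width, which is exactly the kind of output you then clean up by hand when the source had merged cells.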
Code blocks and developer-specific quirks
If you're converting technical documentation, code blocks matter the most. The converter looks for <pre><code> pairs and emits triple-backtick fenced blocks. If the source HTML uses syntax-highlighter classes like language-python or language-js, those get preserved as language hints in the output:
```python
import razorpay
client = razorpay.Client(auth=("rzp_test_XXXX", "secret"))
order = client.order.create({"amount": 50000, "currency": "INR"})
```
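The class-to-hint mapping described above is simple enough to sketch. This is an assumption about how such a rule could look, not the tool's source; fence_language and fenced_block are hypothetical helper names, and only the language-xxx class convention mentioned above is recognised.

```python
import re

FENCE = "`" * 3   # built at runtime so this example never contains a literal fence


def fence_language(class_attr):
    """Return the hint from a class string such as 'hljs language-python', or ''."""
    match = re.search(r"(?:^|\s)language-([\w+#-]+)", class_attr or "")
    return match.group(1) if match else ""


def fenced_block(code, class_attr=""):
    """Wrap code in a fenced block, carrying the language hint over if present."""
    return f"{FENCE}{fence_language(class_attr)}\n{code.rstrip()}\n{FENCE}"
```

A class list without a language-xxx entry simply produces a fence with no hint, which every Markdown renderer accepts.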
One gotcha worth knowing: HTML that wraps code in just a <code> tag (without <pre>) gets treated as inline code, with single backticks. If you have multi-line code that's only wrapped in <code>, the line breaks will be lost. Most modern CMSes and editors use <pre><code> correctly, but some older WordPress themes do not — worth eyeballing the output if your source is from a pre-2018 site.
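If you have a lot of pages from an older site, you can check for that pattern before converting rather than eyeballing every output. A small sketch, assuming the standard-library parser is good enough for your HTML; BareCodeFinder and find_multiline_bare_code are invented names.

```python
from html.parser import HTMLParser


class BareCodeFinder(HTMLParser):
    """Finds <code> elements outside <pre> whose text spans several lines."""

    def __init__(self):
        super().__init__()
        self.pre_depth = 0    # how many <pre> elements are currently open
        self.buffer = None    # text of the bare <code> being read, or None
        self.hits = []        # multi-line snippets that would lose line breaks

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self.pre_depth += 1
        elif tag == "code" and self.pre_depth == 0:
            self.buffer = []
        elif tag == "br" and self.buffer is not None:
            self.buffer.append("\n")   # <br> inside <code> is also a line break

    def handle_endtag(self, tag):
        if tag == "pre":
            self.pre_depth = max(0, self.pre_depth - 1)
        elif tag == "code" and self.buffer is not None:
            text = "".join(self.buffer).strip()
            if "\n" in text:
                self.hits.append(text)
            self.buffer = None

    def handle_data(self, data):
        if self.buffer is not None:
            self.buffer.append(data)


def find_multiline_bare_code(html):
    finder = BareCodeFinder()
    finder.feed(html)
    finder.close()
    return finder.hits
```

Any snippet this returns is a candidate for wrapping in <pre> in the source before you convert.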
Developers handling API responses where HTML is embedded in JSON strings will often pipe the JSON through the JSON formatter first to extract the HTML payload cleanly, then run it through this converter. That two-step flow comes up frequently when migrating CMS data via REST API exports.
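In a script, the extraction step is a JSON parse plus a key lookup. The sketch below assumes the HTML sits under content.rendered, which is the shape WordPress's REST API uses for post bodies; your export may nest it differently, so the path is a parameter. The sample payload and the extract_html name are made up.

```python
import json


def extract_html(raw_json, path=("content", "rendered")):
    """Parse a JSON document and walk down `path` to the embedded HTML string."""
    node = json.loads(raw_json)
    for key in path:
        node = node[key]
    return node


# Invented sample: note the JSON escapes (\u20b9 and \/) inside the HTML string.
raw = '{"id": 7, "content": {"rendered": "<p>Refund of \\u20b9500 settles in <strong>5-7 days<\\/strong>.<\\/p>"}}'
print(extract_html(raw))
```

Parsing the JSON properly matters more than it looks: the escapes are undone by json.loads, whereas copying the string out of the raw file by hand would leave \/ and \u20b9 sequences in your Markdown.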
Common Indian developer workflows where this fits
Static site migrations
A Pune-based EdTech startup with 1,200 lesson pages on a custom PHP CMS moved to Astro for performance reasons. Their Lighthouse scores went from 42 to 96, and hosting moved from a ₹6,500/month VPS to free Cloudflare Pages. The migration script pulled each lesson's HTML body from MySQL, ran it through an HTML-to-Markdown pipeline, and wrote .md files into the Astro content collection. A team of two engineers completed the migration in eight days.
Internal documentation cleanup
Indian services companies — Infosys, TCS, HCL, Wipro — frequently inherit client documentation in whatever format the client used. When a project hands over from one team to the next, normalizing everything to Markdown in a Git repo makes future updates traceable. Running 200 Confluence exports through a converter takes an afternoon; rewriting them by hand takes a month.
Blog content reuse
A freelance technical writer in Jaipur publishes the same article on Medium, dev.to, and Hashnode. Medium exports as HTML; dev.to and Hashnode expect Markdown. Rather than maintain three versions, the writer publishes to Medium first, exports the HTML, converts to Markdown, and uses that as the source of truth for the other two platforms. Total time saved per article: about 20 minutes.
AI output normalization
When you ask Claude or ChatGPT to generate documentation and copy-paste the response into a .md file, the rendered formatting (bold, lists, headings) sometimes pastes as HTML rather than Markdown depending on your clipboard handler. Running the paste through the converter normalizes it before you commit to Git.
Python and command-line alternatives — and when to use them
Many developers search for "html to markdown python" because they want to script the conversion as part of a build pipeline. None of these ship with Python's standard library, but they are the common options worth knowing:
- html2text: a long-established, widely used Python library, available via pip install html2text. Good defaults, configurable for line wrapping and link reference style.
- markdownify: pip install markdownify. Cleaner output for nested lists and tables in my experience.
- turndown: JavaScript equivalent, runs in Node.js and the browser. SabTools' converter is built on a turndown-style rule engine, which is why the output matches what you'd get from a Node build script.
- pandoc: the heavyweight universal document converter. Overkill for HTML-to-MD but handy if you also need DOCX or LaTeX in the same pipeline.
Use the online converter when you're doing a one-off job or testing what the output will look like before committing to a build script. Use a Python or Node library when you're processing more than 50 files or running the conversion on every CI build. For well-formed HTML the output should be structurally equivalent across these tools, though details such as list markers, line wrapping, and escaping differ by library.
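For the "more than 50 files" case, the loop around the library is usually the same regardless of which converter you pick. A sketch under stated assumptions: the folder names are placeholders, convert_tree is a made-up helper, and the converter is passed in as a function so you can try markdownify.markdownify or html2text.html2text (both need a pip install) without changing the loop.

```python
from pathlib import Path


def convert_tree(src_dir, out_dir, convert):
    """Convert every .html file under src_dir to a .md file under out_dir.

    `convert` is any function that takes an HTML string and returns Markdown.
    """
    src, out = Path(src_dir), Path(out_dir)
    written = []
    for html_file in sorted(src.rglob("*.html")):
        # Mirror the source folder layout, swapping the extension.
        target = out / html_file.relative_to(src).with_suffix(".md")
        target.parent.mkdir(parents=True, exist_ok=True)
        markdown = convert(html_file.read_text(encoding="utf-8"))
        target.write_text(markdown.strip() + "\n", encoding="utf-8")
        written.append(target)
    return written


# Example wiring (assumes `pip install markdownify`; folder names are placeholders):
#   from markdownify import markdownify
#   convert_tree("confluence-export", "docs", markdownify)
```

Keeping the converter as a parameter also makes it easy to run the same batch through two libraries and diff the results before committing to one.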
What doesn't convert cleanly — and how to handle it
A few HTML patterns will always be lossy in Markdown:
- Inline styles and colored text. Markdown has no equivalent for <span style="color: #FF6B00">. If you absolutely need the color, you'd embed inline HTML in your Markdown (most renderers allow it) or use a separate color reference. If you're picking accent colors for a doc theme, the color picker helps convert between HEX, RGB, and HSL values.
- Iframes and embeds. YouTube embeds, CodePen iframes, and Tweet embeds all get stripped or converted to plain links. Some Markdown-based systems (like MDX or Hugo shortcodes) support them via custom syntax; you'll need to add those back manually.
- SVG markup. Inline <svg> tags get dropped. If you need to keep diagrams, export them as separate .svg files and reference them as images. The SVG editor lets you clean up exported SVG before linking it.
- Forms and interactive elements. <form>, <button>, and <select> have no Markdown equivalents. They'll be stripped silently.
- Footnotes and definition lists. Original Markdown and CommonMark don't support these. GitHub renders footnotes but not definition lists, while Pandoc's Markdown supports both. Check what your target renderer expects.
For the lossy cases, the practical approach is: convert with the tool, then do a quick pass through the output to add back any custom shortcodes or inline HTML your target system needs. That hybrid workflow is faster than doing everything by hand.
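To know where that quick pass is needed, you can count the lossy elements in the source before converting. A minimal sketch with the standard-library parser; the tag list mirrors the cases above, and the names LOSSY_TAGS, LossyAudit, and audit are invented for this example.

```python
from collections import Counter
from html.parser import HTMLParser

# Elements with no Markdown equivalent, per the list above (dl = definition list).
LOSSY_TAGS = {"iframe", "svg", "form", "button", "select", "dl"}


class LossyAudit(HTMLParser):
    """Counts elements and attributes that will not survive conversion."""

    def __init__(self):
        super().__init__()
        self.found = Counter()

    def handle_starttag(self, tag, attrs):
        if tag in LOSSY_TAGS:
            self.found[tag] += 1
        if any(name == "style" for name, _ in attrs):
            self.found["style attribute"] += 1


def audit(html):
    parser = LossyAudit()
    parser.feed(html)
    parser.close()
    return dict(parser.found)
```

An empty result means the page should convert without manual follow-up; anything else tells you which shortcodes or inline HTML to add back afterwards.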
Privacy and why browser-side conversion matters
The HTML you're converting often contains internal project information — unreleased product specs, customer data references, API keys mistakenly left in code samples, internal Confluence URLs. Sending that to a server-side converter means trusting the operator with your data.
SabTools' converter runs entirely in your browser. The HTML you paste never leaves your machine; the conversion happens in JavaScript locally, and the Markdown output is generated client-side. For developers at Indian banks (which have strict data localization requirements), fintech startups, or anyone handling client NDAs, that matters. You can verify by opening DevTools' Network tab — you won't see a request go out when you click Convert.
A short FAQ for things people actually ask
Does it preserve relative URLs in links?
Yes. If your HTML has <a href="/docs/api">, the Markdown output keeps /docs/api as-is. You'll typically want absolute URLs in standalone Markdown files, so do a find-and-replace pass after conversion if your target is a different domain.
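A plain find-and-replace works when every link starts with the same prefix. For mixed relative links, a small script is safer. This is a sketch, not a full Markdown parser: it rewrites anything that looks like ](target), so it will also touch link-like text inside code blocks, and the base URL https://docs.example.com/guide/ is a placeholder.

```python
import re
from urllib.parse import urljoin

# Matches the "](target" part of inline links and images.
LINK_TARGET = re.compile(r"\]\(([^)\s]+)")


def absolutize_links(markdown, base_url):
    """Resolve relative link and image targets against base_url."""
    def fix(match):
        target = match.group(1)
        if target.startswith("#"):       # leave in-page anchors alone
            return match.group(0)
        return "](" + urljoin(base_url, target)
    return LINK_TARGET.sub(fix, markdown)
```

urljoin leaves targets that already carry a scheme (https:, mailto:) unchanged, so absolute links pass through untouched.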
What about HTML entities like &rupee; or &nbsp;?
Standard entities (&amp;, &lt;, &gt;, &quot;, &nbsp;) get decoded to their character equivalents. Numeric entities like &#8377; (₹) also decode correctly. If you see leftover entities in your output, run the source through the HTML decoder first.
Can I convert just a fragment of a page?
Yes — paste only the fragment. The converter doesn't require a complete document with <html> and <body> tags. You can paste a single <div> or even a fragment without any wrapping element.
Will it work on a slow 4G connection?
The tool loads once (about 80KB of JavaScript) and runs offline after that. Conversion of a 50KB HTML document takes under 200ms on a mid-range Android phone. You don't need fast internet — you don't need internet at all once the page is open.
Paste your HTML into the HTML to Markdown converter and copy the cleaned-up Markdown straight into your README, Hugo post, or Notion page.