Most LangChain pipelines start the same way: get reliable content, then split, embed, and store it.
browser.city’s Request API is the fastest path for that first step: it renders pages and returns clean markdown in one call.
If a site needs interaction (auth, clicking, multi-step flows), use Sessions (Playwright) or Humanized REST (/v1/do/*) first, and extract content only once the page is in the state you need.
1) URL -> markdown (Request API)
request.ts
```ts
const apiKey = process.env.BROWSERCITY_API_KEY!;

const opts = {
  method: "POST",
  headers: {
    Authorization: `Bearer ${apiKey}`,
    // Declare the JSON body explicitly; fetch does not infer it from a string.
    "Content-Type": "application/json",
  },
};

// Render the page and ask for markdown in a single call.
const res = await fetch("https://api.browser.city/v1/requests", {
  ...opts,
  body: JSON.stringify({ url: "https://example.com", markdown: true }),
}).then((r) => r.json());

console.log(res.content);
```

```python
import os
import requests

api_key = os.environ["BROWSERCITY_API_KEY"]

# json= serializes the body and sets the Content-Type header.
res = requests.post(
    "https://api.browser.city/v1/requests",
    headers={"Authorization": f"Bearer {api_key}"},
    json={"url": "https://example.com", "markdown": True},
).json()

print(res["content"])
```

```csharp
using System.Net.Http.Headers;
using System.Net.Http.Json;

var apiKey = Environment.GetEnvironmentVariable("BROWSERCITY_API_KEY")!;
var http = new HttpClient();
http.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);

var res = await http.PostAsJsonAsync(
    "https://api.browser.city/v1/requests",
    new { url = "https://example.com", markdown = true });

// Prints the raw JSON response body.
Console.WriteLine(await res.Content.ReadAsStringAsync());
```

```java
import java.net.URI;
import java.net.http.*;

var apiKey = System.getenv("BROWSERCITY_API_KEY");
var http = HttpClient.newHttpClient();
var body = "{\"url\":\"https://example.com\",\"markdown\":true}";

var req = HttpRequest.newBuilder()
    .uri(URI.create("https://api.browser.city/v1/requests"))
    .header("Authorization", "Bearer " + apiKey)
    .header("Content-Type", "application/json")
    .POST(HttpRequest.BodyPublishers.ofString(body))
    .build();

// Prints the raw JSON response body.
var res = http.send(req, HttpResponse.BodyHandlers.ofString());
System.out.println(res.body());
```
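The request snippets assume the call succeeds and that the response carries a non-empty `content` field. A minimal sketch of a more defensive wrapper might look like the following. The helper names (`buildRequestInit`, `parseRequestResponse`, `fetchMarkdown`), the `FetchLike` type, and the idea that failures show up as a non-2xx HTTP status are assumptions made for illustration, not documented browser.city behavior; only the endpoint, the `url`/`markdown` body fields, and the `content`, `url`, `contentType`, and `status` response fields come from the examples in this article.

```typescript
// Structural stand-in for fetch, so the sketch needs no DOM or Node typings.
// In Node 18+ or a browser you would pass the global `fetch` here.
type RequestInitLike = {
  method: string;
  headers: Record<string, string>;
  body: string;
};
type FetchLike = (
  url: string,
  init: RequestInitLike,
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// The response fields this article reads; anything missing becomes null.
interface RequestResult {
  content: string;
  url: string | null;
  contentType: string | null;
  status: number | null;
}

const ENDPOINT = "https://api.browser.city/v1/requests";

// Build the POST options: bearer auth plus a JSON body asking for markdown.
function buildRequestInit(targetUrl: string, apiKey: string): RequestInitLike {
  return {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ url: targetUrl, markdown: true }),
  };
}

// Validate the parsed JSON before it enters the rest of the pipeline.
function parseRequestResponse(json: unknown): RequestResult {
  if (typeof json !== "object" || json === null) {
    throw new Error("Expected a JSON object in the response");
  }
  const obj = json as Record<string, unknown>;
  const content = obj["content"];
  const url = obj["url"];
  const contentType = obj["contentType"];
  const status = obj["status"];
  if (typeof content !== "string" || content.trim() === "") {
    throw new Error("Response has no usable `content` field");
  }
  return {
    content,
    url: typeof url === "string" ? url : null,
    contentType: typeof contentType === "string" ? contentType : null,
    status: typeof status === "number" ? status : null,
  };
}

// Assumption: a non-2xx HTTP status means the request failed.
async function fetchMarkdown(
  fetchImpl: FetchLike,
  targetUrl: string,
  apiKey: string,
): Promise<RequestResult> {
  const res = await fetchImpl(ENDPOINT, buildRequestInit(targetUrl, apiKey));
  if (!res.ok) {
    throw new Error(`Request API returned HTTP ${res.status}`);
  }
  return parseRequestResponse(await res.json());
}
```

A call would then look like `await fetchMarkdown(fetch, "https://example.com", process.env.BROWSERCITY_API_KEY!)`. Taking the fetch function as a parameter keeps the validation logic testable without network access.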
2) Create documents (LangChain + generic)
langchain.ts
```ts
import { Document } from "@langchain/core/documents";

const doc = new Document({
  pageContent: res.content,
  metadata: {
    source: res.url,
    contentType: res.contentType,
    status: res.status,
  },
});
```

```python
from langchain_core.documents import Document

doc = Document(
    page_content=res["content"],
    metadata={
        "source": res["url"],
        "content_type": res["contentType"],
        "status": res["status"],
    },
)
```

```csharp
// LangChain is TS/Python; in C# keep the same shape (content + metadata).
// `res` here stands for the JSON response parsed into an object with these fields.
public record Document(string PageContent, Dictionary<string, object?> Metadata);

var doc = new Document(
    res.content,
    new()
    {
        ["source"] = res.url,
        ["contentType"] = res.contentType,
        ["status"] = res.status,
    });
```

```java
// LangChain is TS/Python; in Java keep the same shape (content + metadata).
// `res` here stands for the JSON response parsed into an object with these fields.
import java.util.Map;

record Document(String pageContent, Map<String, Object> metadata) {}

var doc = new Document(
    res.content,
    Map.of(
        "source", res.url,
        "contentType", res.contentType,
        "status", res.status
    ));
```
From here:
- split markdown into chunks (character or token-based)
- embed and store in your vector DB
- run retrieval + generation on demand
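
The first of these steps, character-based splitting, can be sketched in a few lines. This is a minimal illustration rather than a recommended splitter: the `splitMarkdown` function, the `Chunk` shape, and the default sizes are hypothetical names and values chosen for the example, and LangChain ships its own text splitters that you may prefer in practice.

```typescript
// One chunk of text plus the metadata it inherited from its source document.
interface Chunk {
  text: string;
  metadata: Record<string, unknown>;
}

// Fixed-size character splitter with overlap. Consecutive chunks share
// `overlap` characters so that text cut at a boundary appears in both.
function splitMarkdown(
  markdown: string,
  metadata: Record<string, unknown>,
  chunkSize = 1000,
  overlap = 100,
): Chunk[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("Require chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const step = chunkSize - overlap; // always >= 1, so the loop advances
  const chunks: Chunk[] = [];
  for (let start = 0; start < markdown.length; start += step) {
    chunks.push({
      text: markdown.slice(start, start + chunkSize),
      // Copy the source metadata and record where the chunk came from.
      metadata: { ...metadata, chunkIndex: chunks.length, start },
    });
    // Stop once a chunk has reached the end of the input.
    if (start + chunkSize >= markdown.length) break;
  }
  return chunks;
}
```

Each chunk maps directly onto the document shape used earlier, for example `new Document({ pageContent: chunk.text, metadata: chunk.metadata })`, so the source URL and status travel with every chunk into the vector store.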
What to use when
- Use the Request API for roughly 90% of ingestion (fast, simple, cheap).
- Use Sessions when you need real browser state and deterministic automation.
- Use Humanized REST when you want interactive steps but don’t want to run Playwright in your runtime.
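
The guidance above can be written down as a small decision helper. This is only a sketch of the rule of thumb: `choosePath`, `SiteNeeds`, and the path labels are hypothetical names introduced here, not part of any browser.city SDK, and real decisions may weigh more factors than these two flags.

```typescript
type IngestionPath = "request-api" | "sessions" | "humanized-rest";

interface SiteNeeds {
  // Auth, clicking, or multi-step flows are required before content is visible.
  needsInteraction: boolean;
  // Your runtime is able (and you are willing) to run Playwright.
  canRunPlaywright: boolean;
}

// Encode the three bullets: default to the Request API, and pick between
// Sessions and Humanized REST only when interaction is required.
function choosePath(needs: SiteNeeds): IngestionPath {
  if (!needs.needsInteraction) return "request-api";
  return needs.canRunPlaywright ? "sessions" : "humanized-rest";
}
```

For a public documentation page, `choosePath({ needsInteraction: false, canRunPlaywright: true })` returns `"request-api"`, which matches the default recommended above.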