This guide will walk you through creating your first JSON API endpoint with Pulpminer. In just a few minutes, you’ll be able to transform any webpage into a structured JSON API.
Select a scraper (Scraper 1, Scraper 2, etc.) from the dropdown menu
Each scraper is optimized for a different type of webpage
Try different scrapers to find the one that works best for your page
Select an LLM (AI Model) for JSON generation
Choose from available models (e.g., Gemini 2, GPT-4)
The LLM determines how the JSON is generated from the webpage
Enter a valid CSS selector
Specify a CSS selector to target a specific section of the webpage (e.g., ‘body’, ‘.main-content’, ‘#products’)
This helps extract only the relevant part of the page
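To make the effect of the selector concrete, here is a minimal sketch of how the three example selectors narrow a page. It is not Pulpminer's scraper: it uses only Python's standard library, supports just the tag, `.class`, and `#id` forms, and the sample page and the `select_text` helper are made up for illustration.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


def matches(selector, tag, attrs):
    """Match only the three simple selector forms used in this guide."""
    if selector.startswith("#"):
        return attrs.get("id") == selector[1:]
    if selector.startswith("."):
        return selector[1:] in (attrs.get("class") or "").split()
    return tag == selector


class SectionText(HTMLParser):
    """Collect the text inside the first element matching a simple selector."""

    def __init__(self, selector):
        super().__init__()
        self.selector = selector
        self.depth = 0       # > 0 while inside the matched element
        self.done = False    # True once the matched element has closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS or self.done:
            return
        if self.depth > 0:
            self.depth += 1
        elif matches(self.selector, tag, dict(attrs)):
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self.depth == 0:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth > 0 and data.strip():
            self.chunks.append(data.strip())


def select_text(html, selector):
    """Hypothetical helper: text chunks inside the first matching element."""
    parser = SectionText(selector)
    parser.feed(html)
    parser.close()
    return parser.chunks


# Made-up sample page for illustration.
PAGE = """
<body>
  <nav>Home | About</nav>
  <div class="main-content">
    <ul id="products"><li>Widget</li><li>Gadget</li></ul>
    <p>Free shipping</p>
  </div>
</body>
"""

print(select_text(PAGE, "body"))           # whole page, navigation included
print(select_text(PAGE, ".main-content"))  # main content only
print(select_text(PAGE, "#products"))      # just the product list
```

The narrower the selector, the less unrelated text (navigation, footers) reaches the LLM, which is why a specific selector such as ‘#products’ tends to be a better starting point than ‘body’.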
Provide a Data Extraction Rule for AI
Optionally describe what you want the AI to extract (e.g., “Fetch all 20 products”, “Extract all blog post titles”)
This guides the LLM for more accurate results
Click “Generate JSON from webpage content”
Wait a few seconds while our AI analyzes the page
Review the generated JSON structure in the editor
The AI will automatically identify and extract key information from the webpage using your selected scraper. You can customize this structure in the next step.
Under “Saving Options”, decide if you want to enable caching:
Enabled: Faster responses; cached data is refreshed every 15 minutes
Disabled: Real-time data; the page is fetched fresh on every request
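The trade-off between the two caching options can be sketched as a small time-to-live cache. This is only an illustration of the behavior described above, not Pulpminer's implementation; the `CachedEndpoint` class, the `scrape` function, and the fake clock are hypothetical names introduced for the example.

```python
import time

CACHE_TTL_SECONDS = 15 * 60  # the 15-minute refresh window described above


class CachedEndpoint:
    """Hypothetical wrapper around a function that scrapes a page."""

    def __init__(self, fetch, caching_enabled, clock=time.monotonic):
        self.fetch = fetch
        self.caching_enabled = caching_enabled
        self.clock = clock
        self._entry = None  # (stored_at, data) once something is cached

    def get(self):
        now = self.clock()
        if self.caching_enabled and self._entry is not None:
            stored_at, data = self._entry
            if now - stored_at < CACHE_TTL_SECONDS:
                return data  # still fresh: serve the stored copy
        data = self.fetch()  # cache disabled, empty, or expired
        if self.caching_enabled:
            self._entry = (now, data)
        return data


# Demo with a fake clock so no real waiting is needed.
calls = []


def scrape():
    calls.append("scrape")
    return {"scrape_number": len(calls)}


now = [0.0]
endpoint = CachedEndpoint(scrape, caching_enabled=True, clock=lambda: now[0])

first = endpoint.get()   # first request scrapes the page
now[0] = 14 * 60
second = endpoint.get()  # 14 minutes later: served from the cache
now[0] = 16 * 60
third = endpoint.get()   # more than 15 minutes later: scraped again
```

With caching enabled, the three requests above trigger only two scrapes. With `caching_enabled=False`, every call to `get()` would scrape the page again.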
Configure session state (only available with Scraper 1):
Use Session: Establishes a session with the origin URL before connecting to the target URL. Useful for .gov websites and other domains that require session authentication.
No Session: Uses a fresh connection for each request
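The difference between the two session modes can be sketched with Python's standard library. This assumes the session is cookie-based, which the guide does not state, and it is not how Pulpminer is implemented; `origin_of` and `fetch` are hypothetical helpers.

```python
import urllib.request
from http.cookiejar import CookieJar
from urllib.parse import urlsplit


def origin_of(url):
    """Return scheme://host[:port]/ for a URL."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/"


def fetch(target_url, use_session, timeout=30.0):
    """Hypothetical fetch: optionally visit the origin before the target."""
    if use_session:
        # One cookie jar shared by both requests, so cookies set by the
        # origin page are sent along with the request for the target page.
        opener = urllib.request.build_opener(
            urllib.request.HTTPCookieProcessor(CookieJar()))
        opener.open(origin_of(target_url), timeout=timeout).close()
    else:
        # No cookie handling: the request starts from a clean state.
        opener = urllib.request.build_opener()
    with opener.open(target_url, timeout=timeout) as response:
        return response.read()
```

For a target such as `https://example.gov/forms/search?q=1`, the session mode in this sketch would first request `https://example.gov/` and then request the target with whatever cookies the first response set.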
Optionally enable dynamic variables:
Toggle the “Enable Dynamic Variables” switch
Define variable names for URL path segments and query parameters
Preview how your dynamic URL will look with the variables
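As a rough illustration of how variables in path segments and query parameters combine into a final URL, consider the sketch below. The `{name}` placeholder syntax, the `build_url` function, and the example URL are assumptions made for this example; the preview in the Pulpminer interface shows the actual format.

```python
from urllib.parse import quote, urlencode


def build_url(base, path_template, path_vars, query_vars):
    """Hypothetical helper: fill path placeholders and append query parameters."""
    # Percent-encode each path value so "/" or spaces cannot change the path shape.
    safe = {name: quote(str(value), safe="") for name, value in path_vars.items()}
    path = path_template.format(**safe)
    query = urlencode(query_vars)
    return f"{base}{path}?{query}" if query else f"{base}{path}"


url = build_url(
    "https://example.com",
    "/products/{category}/{item_id}",
    {"category": "garden tools", "item_id": 42},
    {"page": 2, "sort": "price"},
)
print(url)  # https://example.com/products/garden%20tools/42?page=2&sort=price
```

Keeping path variables and query variables separate mirrors the two kinds of variable named above, and encoding them differently avoids a value such as `a/b` silently adding an extra path segment.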