Zenscrape Connector - Ismena website

Connector Details

Type

Virtual machines, Single VM, BYOL

Runs on

Google Compute Engine

Last Update

24 October, 2024

Category

Overview

The Zenscrape Connector integrates with the Zenscrape API to scrape rendered web pages and to check API request limits. It acts as a proxy for data retrieval and supports three actions: checking the remaining requests for an API key, scraping a URL with customizable parameters (GET), and scraping a URL while submitting form data (POST).

Integration Overview

This document describes each integration point of the Zenscrape Connector: its purpose, parameters, configuration, output, and an example workflow.

Supported Integration Action Points:

getStatus: Retrieves the number of remaining API requests for the provided API key.
getScrape: Retrieves rendered web page content for a specified URL with customizable scraping options (GET method).
postScrape: Sends a POST request to scrape a specified URL with form data (POST method).

Detailed Integration Documentation

2.1 Status Retrieval

Action	getStatus
Purpose	Retrieves the number of remaining API requests available for the provided API key. This serves as the primary entry point for monitoring API usage.
Parameters	Required: apiKey: Your API key (string, e.g., YOUR-API-KEY). Register at https://app.zenscrape.com to obtain. Optional: None.
Configuration	Ensure the connector is configured with the base URL via the CONNECTOR_ENV_ZENSCRAPE_BASE_URL environment variable (default: https://app.zenscrape.com).
Output	Successful: Returns a JSON object with: result: Success status (string, e.g., "success"). remaining_requests: Number of remaining API requests (integer, e.g., 286). Failure: Returns error details (e.g., error-type: invalid-api-key).
Workflow Example	Configure the connector with the appropriate base URL. Execute the getStatus action with apiKey. Process the response to monitor remaining API requests for usage tracking.
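
The configuration and output rows above can be sketched in Python as follows. This is a minimal sketch, not the connector's own code: the helper names (resolve_base_url, parse_status, has_capacity) are hypothetical, and the exact shape of a failure response (an "error-type" key next to "result") is assumed from the example in the Output row. Only the environment variable name, the default base URL, and the success fields come from this page.

```python
import json
import os

# Default stated in the Configuration row of this page.
DEFAULT_BASE_URL = "https://app.zenscrape.com"


def resolve_base_url(env=None):
    """Return the configured base URL, falling back to the documented default."""
    env = os.environ if env is None else env
    value = env.get("CONNECTOR_ENV_ZENSCRAPE_BASE_URL", "").strip()
    return (value or DEFAULT_BASE_URL).rstrip("/")


def parse_status(body):
    """Interpret a getStatus response body given as JSON text.

    Returns (ok, remaining, error). The "error-type" key follows the failure
    example in the table; the full failure shape is an assumption.
    """
    payload = json.loads(body)
    if payload.get("result") == "success":
        return True, int(payload["remaining_requests"]), None
    return False, None, payload.get("error-type", "unknown-error")


def has_capacity(body, needed):
    """True when the response reports at least `needed` remaining requests."""
    ok, remaining, _ = parse_status(body)
    return ok and remaining >= needed


if __name__ == "__main__":
    sample = '{"result": "success", "remaining_requests": 286}'
    print(resolve_base_url())
    print(parse_status(sample))
    print(has_capacity(sample, needed=50))
```

Passing a plain dict as env makes the base URL lookup easy to test without touching the real process environment.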

2.2 Web Content Scraping (GET)

Action	getScrape
Purpose	Retrieves rendered web page content for a specified URL, with options for premium proxies, location, rendering, and scrolling. This helps users obtain web data for analysis or integration.
Parameters	Required: apiKey: Your API key (string, e.g., YOUR-API-KEY). url: The URL to scrape (string, e.g., https://httpbin.org/anything). Optional: premium: Use premium proxies (boolean, e.g., true). location: Geolocation for the request (string, e.g., US). render: Render JavaScript on the page (boolean, e.g., true). scroll_to_bottom: Scroll to the bottom of the page before returning content (boolean, e.g., true).
Configuration	Ensure the connector is configured with the correct base URL.
Output	Successful: Returns a JSON object with:
	result: Success status (string, e.g., "success").
	args: Query parameters (object, e.g., { "url": "https://httpbin.org/anything" }).
	data: Response data (string, e.g., "").
	files: Uploaded files (object, e.g., {}).
	form: Form data (object, e.g., {}).
	headers: Request headers (object, e.g., { "Accept": "text/html,application/xhtml+xml,...", ... }).
	json: JSON data (null or object).
	method: HTTP method (string, e.g., "GET").
	origin: IP address of the request (string, e.g., "38.123.116.248").
	url: Resolved URL (string, e.g., https://httpbin.org/anything?url=https:%2F%2Fhttpbin.org%2Fanything).
	Response headers:
	Date: Response timestamp (string, e.g., "Mon, 15 Feb 2021 15:22:30 GMT").
	Content-Type: Response content type (string, e.g., "application/json").
	Connection: Connection status (string, e.g., "keep-alive").
	access-control-allow-origin: CORS policy (string, e.g., "").
	access-control-allow-credentials: CORS credentials (boolean, e.g., true).
	Zenscrape-Resolved-Url: Resolved URL (string, e.g., https://httpbin.org/anything?...).
	X-Zenscrape-RateLimit-Remaining: Remaining requests (integer, e.g., 286).
	Cache-Control: Cache policy (string, e.g., "private, must-revalidate").
	pragma: Cache pragma (string, e.g., "no-cache").
	expires: Cache expiration (string, e.g., "-1").
	x-robots-tag: Robots policy (string, e.g., "noindex").
	CF-Cache-Status: Cloudflare cache status (string, e.g., "DYNAMIC").
	cf-request-id: Cloudflare request ID (string, e.g., "0847e2433a00004c92179eb000000001").
	Expect-CT: Expect-CT header (string).
	Report-To: Report-To header (string).
	NEL: Network Error Logging header (string).
	Server: Server type (string, e.g., "cloudflare").
	CF-RAY: Cloudflare ray ID (string, e.g., "6220064b8fcd4c92-AMS").
	Content-Encoding: Encoding type (string, e.g., "br").
	Failure: Returns error details (e.g., error-type: invalid-url or error-type: invalid-api-key).
Workflow Example	Execute the getScrape action with apiKey, url=https://httpbin.org/anything, and optional parameters like premium=true or render=true. Review the response to extract scraped web content. Use the content for data analysis or application integration.
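
As an illustration of how the getScrape parameters might translate into a request, the sketch below builds the query string and reads the rate-limit header from a response. Several things here are assumptions rather than facts from this page: the "/api/v1/get" path, the convention of sending boolean flags as the literal string "true", and the helper names build_get_scrape_url and remaining_requests. The parameter names and the X-Zenscrape-RateLimit-Remaining header are taken from the table above.

```python
from urllib.parse import urlencode

# Optional boolean flags listed in the Parameters row.
BOOLEAN_OPTIONS = ("premium", "render", "scroll_to_bottom")


def build_get_scrape_url(base_url, target_url, location=None, **flags):
    """Build the request URL for a getScrape call.

    The "/api/v1/get" path is an assumption; this page does not state it.
    """
    unknown = set(flags) - set(BOOLEAN_OPTIONS)
    if unknown:
        raise ValueError(f"unsupported option(s): {sorted(unknown)}")
    params = {"url": target_url}
    if location:
        params["location"] = location
    for name in BOOLEAN_OPTIONS:
        if flags.get(name):
            params[name] = "true"  # only send flags that are switched on
    return f"{base_url.rstrip('/')}/api/v1/get?{urlencode(params)}"


def remaining_requests(headers):
    """Read X-Zenscrape-RateLimit-Remaining case-insensitively; None if absent."""
    lowered = {key.lower(): value for key, value in headers.items()}
    value = lowered.get("x-zenscrape-ratelimit-remaining")
    return int(value) if value is not None else None


if __name__ == "__main__":
    print(build_get_scrape_url(
        "https://app.zenscrape.com",
        "https://httpbin.org/anything",
        location="US",
        premium=True,
        render=True,
    ))
    print(remaining_requests({"X-Zenscrape-RateLimit-Remaining": "286"}))
```

Rejecting unknown option names up front keeps a typo such as scroll_bottom from being silently dropped.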

2.3 Web Content Scraping (POST)

Action	postScrape
Purpose	Sends a POST request to scrape a specified URL with form data, allowing for dynamic web interactions. This enables users to submit form data during scraping.
Parameters	Required: apiKey: Your API key (string, e.g., YOUR-API-KEY). url: The URL to scrape (string, e.g., https://httpbin.org/anything). Optional: Form data: Key-value pairs (e.g., key=value, key_2=value_2).
Configuration	Ensure the connector is configured with the correct base URL.
Output	Successful: Returns a JSON object with:
	result: Success status (string, e.g., "success").
	args: Query parameters (object, e.g., { "url": "https://httpbin.org/anything" }).
	data: Response data (string, e.g., "").
	files: Uploaded files (object, e.g., {}).
	form: Form data submitted (object, e.g., { "key": "value", "key_2": "value_2" }).
	headers: Request headers (object, e.g., { "Accept": "text/html,application/xhtml+xml,...", ... }).
	json: JSON data (null or object).
	method: HTTP method (string, e.g., "POST").
	origin: IP address of the request (string, e.g., "38.123.116.248").
	url: Resolved URL (string, e.g., https://httpbin.org/anything?url=https:%2F%2Fhttpbin.org%2Fanything).
	Response headers: Same as in getScrape (see above).
	Failure: Returns error details (e.g., error-type: invalid-url or error-type: invalid-api-key).
Workflow Example	Execute the postScrape action with apiKey, url=https://httpbin.org/anything, and form data (e.g., key=value, key_2=value_2). Save the scraped content for further processing. Use the result for dynamic web interactions or data extraction.
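
The postScrape call can be pictured as a form-encoded POST. The sketch below only prepares the request object and does not send it. It assumes, without confirmation from this page, that the target URL travels in the query string of a "/api/v1/get" path and that the key is passed in an "apikey" header; build_post_scrape_request is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_post_scrape_request(base_url, api_key, target_url, form):
    """Prepare (but do not send) a postScrape request carrying form data.

    Assumptions not confirmed by this page: the "/api/v1/get" path and the
    "apikey" header used to pass the key.
    """
    query = urlencode({"url": target_url})
    body = urlencode(form).encode("ascii")  # form fields as key=value pairs
    return Request(
        f"{base_url.rstrip('/')}/api/v1/get?{query}",
        data=body,
        headers={
            "apikey": api_key,
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_post_scrape_request(
        "https://app.zenscrape.com",
        "YOUR-API-KEY",
        "https://httpbin.org/anything",
        {"key": "value", "key_2": "value_2"},
    )
    print(request.get_method(), request.full_url)
    print(request.data)
```

To actually send the request, the prepared object could be passed to urllib.request.urlopen; that step is left out here so the sketch runs without network access or a real key.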

Workflow Creation with the Connector

Example Workflow: Web Scraping and API Usage Monitoring

Check API Usage	Use the getStatus action with apiKey to fetch the number of remaining API requests. Monitor usage to ensure sufficient requests are available for scraping tasks.
Scrape Web Content (GET)	Execute the getScrape action with apiKey, url=https://httpbin.org/anything, and optional parameters like premium=true, location=US, render=true, and scroll_to_bottom=true. Process the response to extract web content for analysis or integration.
Scrape Web Content with Form Data (POST)	Execute the postScrape action with apiKey, url=https://httpbin.org/anything, and form data (e.g., key=value, key_2=value_2). Use the scraped data for dynamic web interactions or to validate form submissions.
Integrate Scraped Data	Combine the scraped content from GET and POST requests for comprehensive data analysis. Integrate the data into applications, such as dashboards or reporting tools.
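
The four workflow steps above could be chained roughly as follows. This is a sketch under stated assumptions: the client object and its get_status, get_scrape, and post_scrape methods are hypothetical wrappers around the three connector actions, and FakeClient simply returns sample values from this page so the example runs without a network connection or API key.

```python
def run_workflow(client, url, form, min_remaining=2):
    """Check usage, scrape via GET and POST, then combine the results.

    `client` is a hypothetical wrapper exposing get_status(),
    get_scrape(url, **options) and post_scrape(url, form), each returning
    the parsed JSON response as a dict.
    """
    # Step 1: check API usage before spending requests.
    status = client.get_status()
    if status.get("result") != "success":
        raise RuntimeError(f"status check failed: {status.get('error-type')}")
    if status["remaining_requests"] < min_remaining:
        raise RuntimeError("not enough remaining requests for both scrapes")

    # Steps 2 and 3: scrape with GET options, then with form data via POST.
    get_result = client.get_scrape(
        url, premium=True, location="US", render=True, scroll_to_bottom=True
    )
    post_result = client.post_scrape(url, form)

    # Step 4: combine both results into one record for downstream tools.
    return {
        "url": url,
        "remaining_before": status["remaining_requests"],
        "get": get_result,
        "post": post_result,
    }


class FakeClient:
    """Stand-in returning sample values from this page; no network calls."""

    def get_status(self):
        return {"result": "success", "remaining_requests": 286}

    def get_scrape(self, url, **options):
        return {"result": "success", "method": "GET", "args": {"url": url}}

    def post_scrape(self, url, form):
        return {"result": "success", "method": "POST", "form": dict(form)}


if __name__ == "__main__":
    report = run_workflow(
        FakeClient(),
        "https://httpbin.org/anything",
        {"key": "value", "key_2": "value_2"},
    )
    print(report["remaining_before"], report["get"]["method"], report["post"]["form"])
```

Injecting the client keeps the orchestration logic independent of how the connector is actually reached, so the same function can be exercised with a stub in tests and with a real wrapper in production.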

Pricing

Request a Quote

Support

For technical support, please contact us at

custom-connectors-support@isolutions.sa
