@syed-al @mentaLwz @QuangTQV Several flags and parameters control how aggressively the content is cleaned. If you share your URL, I can put together an example for it. In the meantime, here is one example that I think answers your question (it will be available in 0.3.72):

from crawl4ai import AsyncWebCrawler

async def main():
    async with AsyncWebCrawler(headless=True, sleep_on_close=True) as crawler:
        url = "https://janineintheworld.com/places-to-visit-in-central-mexico"
        result = await crawler.arun(
            url=url,
            # bypass_cache=True,
            word_count_threshold = 10,
            excluded_tags = ['form'], # Optional - Default is None, this adds more control o…
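
The snippet above is cut off, so here is a rough, standalone sketch of what the two cleaning parameters in it are meant to do: drop everything inside certain tags, then drop text blocks that are too short. It uses only the standard library and is not crawl4ai's implementation. The names `clean_text`, `_BlockCollector`, and `BLOCK_TAGS` are hypothetical, and treating the threshold as "keep blocks with at least N words" is an assumption.

```python
from html.parser import HTMLParser

# Hypothetical illustration only; not crawl4ai code.
# Tags treated as block boundaries (an arbitrary, assumed selection).
BLOCK_TAGS = {"p", "div", "li", "h1", "h2", "h3", "section", "article", "form"}


class _BlockCollector(HTMLParser):
    """Collect text blocks, skipping anything nested in an excluded tag."""

    def __init__(self, excluded_tags):
        super().__init__()
        # Assumes excluded tags are container tags with a closing tag.
        self.excluded_tags = set(excluded_tags or [])
        self.skip_depth = 0   # > 0 while inside an excluded tag
        self.current = []     # text fragments of the block being built
        self.blocks = []      # finished text blocks

    def flush(self):
        # Join fragments and collapse runs of whitespace.
        text = " ".join(" ".join(self.current).split())
        if text:
            self.blocks.append(text)
        self.current = []

    def handle_starttag(self, tag, attrs):
        if tag in self.excluded_tags:
            self.flush()
            self.skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_endtag(self, tag):
        if tag in self.excluded_tags:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.current.append(data)


def clean_text(html, excluded_tags=None, word_count_threshold=10):
    """Return text blocks with at least `word_count_threshold` words."""
    parser = _BlockCollector(excluded_tags)
    parser.feed(html)
    parser.close()
    parser.flush()
    return [b for b in parser.blocks if len(b.split()) >= word_count_threshold]


sample = (
    "<div><p>Short nav</p>"
    "<p>This paragraph has clearly more than three words.</p>"
    "<form><p>Enter your email address to subscribe today please</p></form></div>"
)
cleaned = clean_text(sample, excluded_tags=["form"], word_count_threshold=3)
print(cleaned)
```

With `excluded_tags=["form"]` the newsletter form text is dropped regardless of its length, while the threshold separately removes the two-word navigation fragment. In the real crawler, raising `word_count_threshold` or excluding more tags should make the output leaner in a similar way, but the exact rules are up to the library.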

Replies: 19 comments

Answer selected by aravindkarnam
5 participants
This discussion was converted from issue #181 on January 21, 2025 04:47.