Log in to a website

A simple guide to understand how to log in to any website with Phantombuster.

As you know, many websites have their own login system. Behind that login, there is often highly valuable data for your business or personal use. It could also be data you generated with your own account on the target site, which you now want to export and integrate somewhere else.

Most scrapers out there don't handle logins. But don't worry, with Phantombuster it's easy. Let's see!

Let's fill a login form

First of all, how do you log in to an account on a website?
When you do it manually as a human, you click on each input and fill it in, one by one, then you click to submit the form. That's exactly what you can do with Phantombuster and the following Puppeteer methods:

await page.type("#email", "[email protected]")
await page.type("#password", "johnjohn")
await page.click("button[type=\"submit\"]")

Whoa... so many things? I am going to explain all of this.

  • The first argument of both of these methods is a CSS selector. You can find it using the developer console on the website you want to scrape. Here, the login form looks like this:
<form method="post" action="/login/auth">
    <div class="form-group">
        <label for="email">Email</label>
        <input type="email" class="form-control" name="email" id="email">
    </div>
    <div class="form-group">
        <label for="password">Password</label>
        <input type="password" class="form-control" name="password" id="password">
    </div>
    <button type="submit" class="btn btn-default">Connect</button>
</form>

We can see that the form contains two fields (HTML <input> elements) with id attributes. The CSS selector for an id is #<the_id>. In our case: #email and #password!

  • We fill both these fields with our credentials.
  • The form is ready to be submitted, so we just have to click the submit button! The button's selector is button[type="submit"], meaning a button tag with a type attribute whose value is "submit".
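
The three steps above can be wrapped in a small reusable helper. This is only a minimal sketch: fillLoginForm and defaultSelectors are hypothetical names (they are not part of Puppeteer or the Buster library), and it assumes page is a Puppeteer page that has already navigated to the login form. It only relies on page.waitForSelector, page.type and page.click.

```javascript
// Hypothetical helper, not part of Puppeteer or the Buster library.
// Assumption: `page` is a Puppeteer Page already showing the login form.
const defaultSelectors = {
    email: "#email",
    password: "#password",
    submit: "button[type=\"submit\"]",
}

const fillLoginForm = async (page, credentials, selectors = defaultSelectors) => {
    // make sure both fields exist before typing into them
    await page.waitForSelector(selectors.email)
    await page.waitForSelector(selectors.password)

    // fill the fields one by one, like a human would
    await page.type(selectors.email, credentials.email)
    await page.type(selectors.password, credentials.password)

    // submit the form
    await page.click(selectors.submit)
}
```

Inside a script, the call would look like await fillLoginForm(page, { email, password }). Passing a different selectors object lets the same helper work on a form whose fields use other ids.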

Complete example

For this example, I am going to use our challenges site.

You'll need to log in with these credentials:
[email protected] as email and johnjohn as the password.

  1. First, create a new custom script.
  2. You can now edit your script (the source code that will be executed by Node.js in the cloud, using Puppeteer, the Buster library, and more) in our online editor or using our SDK.
"phantombuster command: nodejs"
"phantombuster package: 5"
"phantombuster flags: save-folder"

/**
 * The directives above are Phantombuster configuration.
 * To learn more about script directives, check out our guide on writing custom scripts.
 * @see https://hub.phantombuster.com/docs/script-directives
 */

const Buster = require("phantombuster")
const puppeteer = require("puppeteer")

const buster = new Buster()

const scrapePreciousData = () => {
    // we are in the browser's scope here
    const data = []
    document.querySelectorAll("div.person > div.panel-body").forEach((person) => {
        data.push({
            name: person.querySelector(".name").innerText.trim(),
            birthYear: person.querySelector(".birth_year").innerText.trim(),
            gender: person.querySelector(".gender").innerText.trim(),
            job: person.querySelector(".job").innerText.trim(),
        })
    })
    return data
}

;(async () => {
    const browser = await puppeteer.launch({
        // This is needed to run Puppeteer in a Phantombuster container
        args: ["--no-sandbox"]
    })

    // Go to the page and wait for it to load
    const page = await browser.newPage()
    await page.goto("http://scraping-challenges.phantombuster.com/login")
    await page.waitForSelector("form")

    // fill and submit the login form
    await page.type("#email", "[email protected]")
    await page.type("#password", "johnjohn")
    await page.click("button[type=\"submit\"]")

    // wait for the "private" data to appear!
    await page.waitForSelector(".panel-body")

    // take a screenshot to make sure everything is fine
    await page.screenshot({ path: "screenshot.png", fullPage: true })

    // scrape all the precious data
    const preciousData = await page.evaluate(scrapePreciousData)

    // and save it to your result object
    await buster.setResultObject(preciousData)

    // clean up everything
    await page.close()
    await browser.close()
    process.exit()
})()
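
The example above assumes the login always succeeds. If the credentials are rejected, the ".panel-body" selector never appears and page.waitForSelector eventually throws when its timeout expires. Below is a minimal sketch of making that case explicit. waitForLoginResult is a hypothetical helper (not part of Puppeteer or the Buster library), and the 10-second default timeout is an arbitrary assumption.

```javascript
// Hypothetical helper: resolves to true if an element that only exists
// after a successful login shows up in time, and to false otherwise.
const waitForLoginResult = async (page, loggedInSelector, timeoutMs = 10000) => {
    try {
        // Puppeteer rejects this promise if the selector does not appear within `timeout` ms
        await page.waitForSelector(loggedInSelector, { timeout: timeoutMs })
        return true
    } catch (error) {
        // Assumption: any failure here (typically a timeout) means we are not logged in
        return false
    }
}
```

In the script above, this could stand in for the bare waitForSelector call: when await waitForLoginResult(page, ".panel-body") returns false, log an error and stop instead of scraping a page that holds no data.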

Watch your agent do its job

Launch it and enjoy your data!

Retrieve the data

💡We used buster.setResultObject to store the result object. We can now retrieve it using Phantombuster's API.

Updated 3 months ago

