Use TypeScript
TypeScript brings productive development tooling, such as static type checking and IDE support, to JavaScript. It makes code easier to read and understand, and there is a lot to gain from using it over plain JavaScript.
Before diving deeper into this guide, please take some time to read the previous guides as we will only cover TypeScript-related things in this one :)
Source code
You can find the resulting repository of this guide on our GitHub.
Setting up our project
Phantombuster's SDK
We made an SDK to ease the development process. You should install it globally. It is really useful for uploading your source code every time you change it.
npm install -g phantombuster-sdk
You now need to fill in your Phantombuster config file (phantombuster.cson). We will come back to this file later; for now it should look like this:
[
  name: 'My first TypeScript Phantom'
  apiKey: 'YOUR_API_KEY'
  scripts:
]
Some useful resources:
Dependencies
We still need the TypeScript compiler and the type definitions we will be using in our project:
npm install --save-dev typescript @types/puppeteer@<version> @types/node
Puppeteer
The Puppeteer types version should match the Puppeteer version provided by the Phantombuster package you are using. We wrote a list of all available packages so you can see which modules each package includes.
We will also need types for our BusterJS agent module; they are available in our public gists.
Let's set up TypeScript with a strict tsconfig.json file to get the most out of it:
{
  "compilerOptions": {
    "moduleResolution": "node",
    "target": "es5",
    "module": "esNext",
    "outDir": "dist",
    "lib": ["dom", "es2015"],
    "esModuleInterop": true,
    "strict": true,
    "noImplicitAny": true,
    "strictNullChecks": true,
    "strictFunctionTypes": true,
    "alwaysStrict": true
  },
  "include": ["./src"]
}
Make our script run on Phantombuster
We are ready to write our TypeScript script. Let's create a new index.ts file in the src folder:
mkdir src
touch src/index.ts
For now, let's write a really simple script in src/index.ts, just to see it run:
"phantombuster command: nodejs"
"phantombuster package: 5"
;(async () => {
  console.log("Hello from the ether side!")
  process.exit()
})()
Phantombuster cannot execute TypeScript, so you need to compile it into JavaScript using tsc. It reads our tsconfig.json, where we included our src folder, and transpiles the TypeScript files inside into the dist folder (outDir), keeping the exact same structure.
tsc
We now have a dist/index.js file that corresponds to our transpiled TypeScript. Let's upload it to Phantombuster using the SDK. First, in phantombuster.cson, let's create a mapping of Phantombuster script names to local script files. Our local file dist/index.js will be mapped to hackerNews.js on Phantombuster.
[
  name: 'account name'
  apiKey: 'YOUR_API_KEY_HERE'
  scripts:
    'hackerNews.js': 'dist/index.js'
]
Let's upload it!
phantombuster dist/index.js
You can now go to your catalog and click "Use this Phantom" (this is a user-friendly way of saying "Create an agent from this script").
Go through the setup and you will land on your agent's console. To debug your script, open your management menu and click Toggle console. You can now launch it and see it run live!
Automate the process
This upload process is tedious, and we don't want to repeat it every time we update something.
The new process, using watch modes, is as follows:
- Open a terminal and run TypeScript in watch mode. It performs an incremental compilation each time our TypeScript code is updated.
tsc --watch
- The generated JavaScript file is now automatically kept in sync with our TypeScript source. It's time to synchronize it with Phantombuster. Open a new terminal and launch the Phantombuster SDK in watch mode: it monitors a directory on your disk for changes in your scripts. As soon as a change is detected, the script is uploaded to your Phantombuster account.
phantombuster
That's it: every change you make in your source TypeScript file will update your Phantom's code!
Write a complete script
In this section, we will rewrite our script from the developer's quick start using TypeScript! We will scrape HackerNews and extract some precious data.
First, let's copy the JavaScript source into our src/index.ts file:
"phantombuster package: 5"
"phantombuster command: nodejs"
"phantombuster flags: save-folder"
const Buster = require("phantombuster")
const puppeteer = require("puppeteer")
const buster = new Buster()
;(async () => {
  const browser = await puppeteer.launch({
    // This is needed to run Puppeteer in a Phantombuster container
    args: ["--no-sandbox"]
  })
  const page = await browser.newPage()
  await page.goto("https://news.ycombinator.com")
  await page.waitForSelector("#hnmain")
  const hackerNewsLinks = await page.evaluate(() => {
    const data = []
    document.querySelectorAll("a.storylink").forEach((element) => {
      data.push({
        title: element.text,
        url: element.getAttribute("href")
      })
    })
    return data
  })
  await buster.setResultObject(hackerNewsLinks)
  await page.screenshot({ path: "hacker-news.png" })
  await page.close()
  await browser.close()
  process.exit()
})()
We copied some JavaScript into a TypeScript file and immediately got some errors. Let's fix them, beginning with this one:
Line 21: Variable 'data' implicitly has type 'any[]' in some locations where its type cannot be determined.
data has an implicit Array<any> type, which makes TypeScript's static checking useless. This error is raised because we set "noImplicitAny" to true in our tsconfig.json.
What information does an item of data contain? A title and a URL, both of which should be strings.
In TypeScript, this type declaration can be made through an interface:
[...]
interface IHackerNewsLink {
  title: string
  url: string
}
[...]
We can now declare our data variable as an array of IHackerNewsLink:
[...]
const data: IHackerNewsLink[] = []
[...]
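To make the benefit of this annotation concrete, here is a minimal, self-contained sketch. It reuses the IHackerNewsLink interface from above; the sample story and the addLink helper are made up for illustration and are not part of the guide's script.

```typescript
interface IHackerNewsLink {
  title: string
  url: string
}

// Thanks to the annotation, every push into the array is checked against the interface.
const data: IHackerNewsLink[] = []

// Hypothetical helper, for illustration only.
function addLink(title: string, url: string): void {
  data.push({ title, url })
}

addLink("Example story", "https://example.com/story")

// The following line would be rejected at compile time, because url is missing:
// data.push({ title: "No URL" })

// Items read back from the array are fully typed too: url is known to be a string.
const firstUrlLength = data[0].url.length
console.log(data.length, firstUrlLength)
```

With the annotation in place, a mistake in the shape of an item surfaces when compiling rather than when the script runs on Phantombuster.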
TypeScript seems satisfied with our update on this error, but another one has been raised:
Type 'string | null' is not assignable to type 'string'. Type 'null' is not assignable to type 'string'. The expected type comes from property 'url' which is declared here on type 'IHackerNewsLink'.
This error is caused by getAttribute("href"), which returns string | null. That makes the url property of our result string | null too, which is not assignable to string.
We want to make sure that we can grab a string value for our url property.
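As a side note, the sketch below illustrates how strictNullChecks treats a string | null value in general, independently of the DOM. The readHref and toLink helpers are hypothetical: readHref only imitates the return type of getAttribute("href") so that the example runs under plain Node.js.

```typescript
interface IHackerNewsLink {
  title: string
  url: string
}

// Hypothetical stand-in for getAttribute("href"): returns null when the attribute is absent.
function readHref(attributes: Map<string, string>): string | null {
  const value = attributes.get("href")
  return value === undefined ? null : value
}

// Hypothetical helper: builds a link only when a URL is actually available.
function toLink(title: string, href: string | null): IHackerNewsLink | null {
  // Inside this branch, the compiler has narrowed href from string | null to string.
  if (href !== null) {
    return { title, url: href }
  }
  return null
}

const withHref = toLink("Example story", readHref(new Map([["href", "https://example.com/story"]])))
const withoutHref = toLink("No link", readHref(new Map<string, string>()))
console.log(withHref, withoutHref)
```

An explicit null check like this one is enough to satisfy the compiler. The rest of this guide takes a different route and relies on a more precise element type instead.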
document.querySelectorAll returns a list typed as NodeListOf<Element>. Element is not as precise as we would like it to be: we know that we are querying HTMLAnchorElement elements (see the a in the a.storylink selector), and an HTMLAnchorElement always has a string property called href, which is exactly what we are looking for!
There are 3 ways around this problem:
- an ugly cast that defeats the purpose of TypeScript. Let's not do this.
document.querySelectorAll("a.storylink").forEach((element) => {
  data.push({
    title: (element as HTMLAnchorElement).text,
    url: (element as HTMLAnchorElement).href
  })
})
- use document.querySelectorAll's generic type parameter, as in document.querySelectorAll<HTMLAnchorElement>("a.storylink"). This works like a cast: nothing is actually checked, and your type and your query could be completely unrelated. Don't do this either.
document.querySelectorAll<HTMLAnchorElement>("a.storylink").forEach((element) => {
  data.push({
    title: element.text,
    url: element.href
  })
})
- write a type guard that checks that element truly is an HTMLAnchorElement. No runtime error on this one!
document.querySelectorAll("a.storylink").forEach((element) => {
  if (element instanceof HTMLAnchorElement) {
    data.push({
      title: element.text,
      url: element.href,
    })
  }
})
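The difference between the second and third options can be sketched without a browser. In the example below, FakeElement, FakeAnchorElement and queryAll are hypothetical stand-ins for Element, HTMLAnchorElement and querySelectorAll; they only imitate the relevant typing behavior and are not part of any real API.

```typescript
// Hypothetical stand-ins for the DOM's Element and HTMLAnchorElement,
// so that the sketch runs under plain Node.js without a browser.
class FakeElement {
  constructor(public readonly tagName: string) {}
}

class FakeAnchorElement extends FakeElement {
  constructor(public readonly text: string, public readonly href: string) {
    super("A")
  }
}

// Imitates querySelectorAll<E>: the type parameter only appears in the return type,
// so the caller's choice is taken on trust, just like a cast.
function queryAll<E extends FakeElement>(elements: FakeElement[]): E[] {
  return elements as E[]
}

const elementsOnPage: FakeElement[] = [
  new FakeElement("DIV"),
  new FakeAnchorElement("Example story", "https://example.com/story"),
]

// Option 2: this compiles, but the first item is not an anchor,
// so its href is undefined at runtime despite being typed as string.
const trusted = queryAll<FakeAnchorElement>(elementsOnPage)
const uncheckedUrls = trusted.map((element) => element.href)

// Option 3: instanceof is evaluated at runtime and narrows the static type as well,
// so only real anchors are ever read.
const checkedUrls: string[] = []
elementsOnPage.forEach((element) => {
  if (element instanceof FakeAnchorElement) {
    checkedUrls.push(element.href)
  }
})

console.log(uncheckedUrls, checkedUrls)
```

The generic version silently lets a non-anchor through, while the instanceof check filters it out and keeps the static type honest.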
Have you noticed that this change also fixed the last error we had? This one:
Line 25: Property 'text' does not exist on type 'Element'
The text property might not exist on type Element, but it surely does on type HTMLAnchorElement!
Conclusion
Congratulations!
You just wrote your first script using TypeScript! The whole source code of this guide is available on our GitHub.