Selenium Automation in Google Colab: Install Selenium Driver and Create Twitter Follow Bot Tutorial



Google Colab is a popular cloud-based platform that allows you to run Python code in a Jupyter notebook environment. While Colab is primarily used for machine learning and data analysis tasks, it can also be used for web scraping and automation tasks using the Selenium library.

In this blog post, we will walk through how to set up Selenium automation on Google Colab, including the installation process and an example of how to use Selenium to build a Twitter automation bot.

Step 1: Install Selenium

First, open your Google Colab notebook. Then install Selenium by running the following command in a notebook cell (the leading ! tells Colab to run the line as a shell command):

Cell 1
!pip install selenium

This will install the Selenium library on your Colab notebook.
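
To confirm the install worked before moving on, a quick check along these lines can help. This is a minimal sketch that uses only the standard library; the helper name installed_version is made up for this example and is not part of Selenium.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the Selenium version if Cell 1 succeeded, otherwise None.
print("selenium:", installed_version("selenium"))
```

If the output shows None, re-run Cell 1 before continuing.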

Step 2: Install Chromium Browser and Driver

Since Ubuntu no longer distributes chromium-browser outside of snap, we need a few extra commands to install it from the Debian repositories instead. Copy the following commands into a single Colab cell whose first line is %%shell, then run the cell.

Cell 2
%%shell
# Add debian buster
cat > /etc/apt/sources.list.d/debian.list <<'EOF'
deb [arch=amd64 signed-by=/usr/share/keyrings/debian-buster.gpg] http://deb.debian.org/debian buster main
deb [arch=amd64 signed-by=/usr/share/keyrings/debian-buster-updates.gpg] http://deb.debian.org/debian buster-updates main
deb [arch=amd64 signed-by=/usr/share/keyrings/debian-security-buster.gpg] http://deb.debian.org/debian-security buster/updates main
EOF
# Add keys
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys DCC9EFBF77E11517
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 648ACFD622F3D138
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 112695A0E562B32A

apt-key export 77E11517 | gpg --dearmour -o /usr/share/keyrings/debian-buster.gpg
apt-key export 22F3D138 | gpg --dearmour -o /usr/share/keyrings/debian-buster-updates.gpg
apt-key export E562B32A | gpg --dearmour -o /usr/share/keyrings/debian-security-buster.gpg

# Prefer debian repo for chromium* packages only
# Note the double-blank lines between entries
cat > /etc/apt/preferences.d/chromium.pref << 'EOF'
Package: *
Pin: release a=eoan
Pin-Priority: 500


Package: *
Pin: origin "deb.debian.org"
Pin-Priority: 300


Package: chromium*
Pin: origin "deb.debian.org"
Pin-Priority: 700
EOF

# Install chromium browser and driver
# (no leading "!" here: the whole cell already runs as a shell script)
apt-get update
apt-get install -y chromium chromium-driver
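
After the shell cell finishes, it can be worth checking that the browser and driver executables are actually on the PATH. The sketch below assumes the Debian packages install executables named chromium and chromedriver; the helper name find_binaries is hypothetical.

```python
import shutil


def find_binaries(names):
    """Map each executable name to its full path, or None when it is not on PATH."""
    return {name: shutil.which(name) for name in names}


# Assumed executable names for the packages installed in Cell 2.
for name, path in find_binaries(["chromium", "chromedriver"]).items():
    print(f"{name}: {path or 'not found'}")
```

A "not found" line usually means the apt-get step did not complete, so check the output of Cell 2 first.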
             

Step 3: Create Selenium options

Once the dependencies are installed, you need to create Selenium options so that the webdriver does not crash on startup in Colab. Since a Colab notebook runs on an Ubuntu machine with no GUI, it's important to add the following options:

Cell 3
# Create a new cell, add this function, and click run
from selenium import webdriver

def web_driver():
  options = webdriver.ChromeOptions()
  options.add_argument('--verbose')
  options.add_argument('--no-sandbox')
  options.add_argument('--headless')
  options.add_argument('--disable-gpu')
  # No space after the comma in the size value
  options.add_argument('--window-size=1920,1200')
  options.add_argument('--disable-dev-shm-usage')
  driver = webdriver.Chrome(options=options)
  return driver

Now we can call this function to start our Chrome webdriver.
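
As a quick smoke test, the driver can be started, pointed at a page, and shut down again. The following is only a sketch: the names COLAB_CHROME_ARGS and page_title and the URL https://example.com are illustrative assumptions, and the flags are the ones listed in Cell 3.

```python
# Flags from Cell 3, kept in one list so they are easy to review.
COLAB_CHROME_ARGS = [
    "--no-sandbox",             # Chrome refuses to start as root with the sandbox on
    "--headless",               # there is no display attached to the notebook machine
    "--disable-gpu",
    "--window-size=1920,1200",  # no space after the comma
    "--disable-dev-shm-usage",  # do not rely on a small /dev/shm
]


def page_title(url):
    """Open url in a headless Chrome session and return the page title."""
    # Imported here so the flag list can be inspected without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for arg in COLAB_CHROME_ARGS:
        options.add_argument(arg)
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.title
    finally:
        driver.quit()  # always release the browser process


# Example (requires Cells 1 and 2 to have run):
# print(page_title("https://example.com"))
```

Calling driver.quit() in a finally block keeps stray Chrome processes from piling up in the Colab session when a page load fails.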

Step 4: Automating Twitter Handle Following using Selenium

The first step is to import the required libraries. Create a new cell and import all of them in it.

Cell 4
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium import webdriver
from time import sleep
import random

The next step is to define the driver_wait function in a new cell and run that cell so the function is available. It wraps the wait conditions we need in one place.

Cell 5
def driver_wait(my_xpath, w_time=10, wait_type=EC.presence_of_element_located, untilnot=False,
                driver=None):
    # A driver must be passed in explicitly; without one there is nothing to wait on.
    if driver is None:
        print('wait_ex: no driver given')
        return False
    # A string is treated as an XPath; anything else is passed through as a ready-made locator.
    locator = (By.XPATH, my_xpath) if isinstance(my_xpath, str) else my_xpath
    wait = WebDriverWait(driver, w_time)
    try:
        if untilnot:
            # Wait until the condition stops holding (for example, the element disappears).
            return wait.until_not(wait_type(locator))
        return wait.until(wait_type(locator))
    except Exception as ex:
        # Usually a timeout: the condition was not met within w_time seconds.
        print('wait_ex:', type(ex).__name__)
        return False

The driver_wait function waits for an element to load on a web page before we perform an action on it. It takes five arguments:

my_xpath: The XPath of the element we want to wait for (or a ready-made locator).
w_time: The maximum time, in seconds, to wait for the element.
wait_type: The expected condition to wait for.
untilnot: A boolean; when True, we wait for the condition to stop holding (for example, for the element to disappear) instead of waiting for it to hold.
driver: The WebDriver instance to wait on.

The function returns the result of the wait (the element, for the default condition) if the condition is met within the time limit, or False if it is not.
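
To make the idea of waiting on a condition concrete, here is a simplified, standard-library-only stand-in for the polling that an explicit wait performs. It is not Selenium's implementation; wait_until and ready_on_third_call are names invented for this sketch.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it is truthy or timeout seconds pass.

    Returns the truthy value, or False on timeout (the same style as driver_wait).
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)


# Example: a condition that only succeeds on the third call.
calls = {"n": 0}


def ready_on_third_call():
    calls["n"] += 1
    return "element" if calls["n"] >= 3 else None


print(wait_until(ready_on_third_call, timeout=2, poll=0.01))  # prints: element
```

The same pattern explains why driver_wait can return either an element or False: the caller should check the result before using it.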

Step 5: Open the Driver and Log In to Twitter

Cell 6
def get_driver():
    # Reuse the Colab-safe options (headless, no sandbox) from web_driver() in Cell 3
    driver = web_driver()

    driver.get('https://twitter.com/login')

    for x in 'YourUsername':
        element = driver_wait('//input[@autocomplete="username"]', w_time=20, driver=driver)
        element.send_keys(x)
    sleep(2)
    submit = driver_wait('//span[text()="Next"]', driver=driver)
    driver.execute_script("arguments[0].click();", submit)
    sleep(5)

    for x in 'YourPassword':
        element = driver_wait('//input[@autocomplete="current-password"]', w_time=20, driver=driver)
        element.send_keys(x)
    submit = driver_wait('//span[text()="Log in"]', driver=driver)
    driver.execute_script("arguments[0].click();", submit)

    sleep(5)
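    handle_to_follow = ''  # set this to the handle whose followers you want to follow
    # f-string, so that {handle_to_follow} is actually substituted into the URL
    driver.get(f'https://twitter.com/{handle_to_follow}/followers')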
    '''for i in 'Python web scraping':
        search_ = driver_wait('//input[@aria-label="Search query"]', driver=driver)
        search_.send_keys(i)
        sleep(0.2)'''
    #search_.submit()
    sleep(5)
    return driver 

The get_driver Function

The get_driver function is responsible for initializing the Selenium WebDriver object and navigating to the Twitter login page. Once on the login page, the function enters the username and password to log in to the user's Twitter account.
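
Typing the username and password directly into the cell means they are saved with the notebook. One possible alternative, sketched here, is to read them from environment variables. The variable names TWITTER_USERNAME and TWITTER_PASSWORD and the helper read_credentials are assumptions made for this example.

```python
import os


def read_credentials(env=None):
    """Return (username, password) from environment variables, failing loudly if unset."""
    env = os.environ if env is None else env
    try:
        return env["TWITTER_USERNAME"], env["TWITTER_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(f"Set the {missing.args[0]} environment variable first") from None


# Example with an explicit mapping instead of the real environment:
user, password = read_credentials(
    {"TWITTER_USERNAME": "YourUsername", "TWITTER_PASSWORD": "YourPassword"}
)
```

In a notebook, the variables could be filled in with getpass.getpass() from the standard library, so the password never appears in a saved cell.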

After logging in, the function navigates to the followers page of the Twitter handle specified in the script. Note that the driver_wait function is used to wait for page elements to load before interacting with them.
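
The followers URL has to be built from the handle before the driver can navigate to it. In Python, the braces in a string are only substituted when the string is an f-string (or passed through str.format). A small sketch, with followers_url and example_handle as made-up names:

```python
def followers_url(handle):
    """Build the followers-page URL for a handle, accepting 'name' or '@name'."""
    name = handle.strip().lstrip("@")
    if not name:
        raise ValueError("handle_to_follow must not be empty")
    return f"https://twitter.com/{name}/followers"


print(followers_url("@example_handle"))  # prints: https://twitter.com/example_handle/followers
```

Rejecting an empty handle early avoids sending the driver to a URL that cannot point at any account.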

Step 6: Follow the Followers and Handle Exceptions

Cell 7
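def click_follow():
    driver = get_driver()
    x = 0  # follows since the last page refresh
    while True:
        try:
            if x == 100:
                # Refresh after every 100 follows, then start counting again
                driver.refresh()
                sleep(20)
                x = 0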
            sleep(3)
            check = driver_wait('//span[text()="Follow"]', driver=driver, w_time=20)
            if check:
                handle = driver_wait('//span[text()="Follow"]//ancestor::div[@data-testid="UserCell"][1]//a', driver=driver)
                print(handle.text)
                driver.execute_script("arguments[0].scrollIntoView();", handle)
                driver.execute_script("arguments[0].click();", handle)
                #ActionChains(driver).move_to_element(handle)
                click_ = driver_wait('//span[text()="Follow"]', driver=driver, w_time=20)
                driver.execute_script("arguments[0].scrollIntoView();", click_)
                driver.execute_script("arguments[0].click();", click_)
                sleep(random.uniform(2, 5))
                driver.back()
                x += 1
            if not check:
                driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
                sleep(random.uniform(2, 5))

        except Exception as ex:
            print(ex)
            input('Please check the error message and press enter: ')

Handling Errors in click_follow Function

The click_follow function has a try-except block to catch any exceptions that might be thrown during its execution. If an error is encountered, the function prints the error message to the console and then prompts the user to check the error message before continuing. This allows the user to identify and fix any issues with the function's execution before resuming the process.
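
Pausing for user input works when someone is watching the notebook. For unattended runs, one alternative is to cap the number of actions and stop after a few consecutive errors. The sketch below shows that pattern on its own, without Selenium; run_with_limits and its parameters are hypothetical and are not part of the script above.

```python
import random
import time


def run_with_limits(action, max_actions=50, max_attempts=200, max_errors=3,
                    delay_range=(2.0, 5.0)):
    """Call action() until a limit is hit and return the number of successes.

    action() should return True after a successful follow and False when nothing
    was found. The loop stops after max_actions successes, max_attempts calls,
    or max_errors consecutive exceptions.
    """
    done = 0
    errors = 0
    for _ in range(max_attempts):
        if done >= max_actions or errors >= max_errors:
            break
        try:
            if action():
                done += 1
            errors = 0  # any call that did not raise resets the failure streak
        except Exception as ex:
            errors += 1
            print(f"error {errors}/{max_errors}: {ex}")
        time.sleep(random.uniform(*delay_range))  # randomized pause between attempts
    return done


# Example with a fake action and no delay: two successes out of three calls.
outcomes = iter([True, False, True])
print(run_with_limits(lambda: next(outcomes), max_actions=2, delay_range=(0, 0)))  # prints: 2
```

In the bot, the body of the while loop in click_follow could play the role of action, returning True when a follow button was clicked.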

Cell 8
# Let's call our function to trigger the bot
if __name__ == '__main__':
  click_follow()

Conclusion

In conclusion, the script presented above shows how to use Selenium and Python to automate following the followers of a specific Twitter handle. The script uses a combination of functions to handle errors, wait for page elements to load, and interact with the Twitter website. With some modifications, the script can be customized to target other Twitter handles or to perform other tasks on the platform.