
Business

Web Scraping in 2022 & Beyond

Web scraping has been coming into the limelight in recent years due to the rising interest in data. Businesses across the globe have been eyeing automated data collection as a way to enhance their profitability and overall decision making.

We’ve sat down with the Lead of Commercial Product Owners at Oxylabs.io, Nedas Višniauskas, to talk about the future of web scraping. Few people have been as deeply involved with the industry as Nedas, which has allowed him to gain a unique perspective on how it has developed and how it will continue to do so.

What do you think has been the biggest change in web scraping over the last decade? How has Oxylabs participated in these changes?

There have been some interesting changes during the past few years. One of them, I think, has been the proliferation of increasingly sophisticated anti-bot systems. Scraping such websites at scale, in turn, becomes more difficult.

Scraping enthusiasts, of course, have their own answer to these issues, which is to develop dedicated data collection tools. These, while limiting the field of use, can bypass the anti-bot systems and they are constantly being updated for that purpose.

Another important change has been the rising popularity of JavaScript. More and more websites are using it to load critically important data dynamically, which means it’s essentially unreachable without browsers.

Headless ones, therefore, are a necessity. At the same time, that means infrastructure costs are rising as headless browsers take up much more computing power and traffic than simple HTTP requests.
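To illustrate the point about dynamically loaded data, here is a minimal sketch using only Python's standard library. The two HTML strings are hypothetical: `RAW_HTML` stands in for what a simple HTTP request would receive, and `RENDERED_HTML` for what a headless browser would hold after running the page's script. The page, the `/api/prices` endpoint, and the prices are all invented for illustration.

```python
from html.parser import HTMLParser

# Hypothetical response to a plain HTTP request: the list is empty because
# the script that fills it has not been executed.
RAW_HTML = """
<html><body>
  <ul id="prices"></ul>
  <script>
    fetch("/api/prices").then(r => r.json()).then(items => {
      const list = document.getElementById("prices");
      for (const item of items) {
        const li = document.createElement("li");
        li.textContent = item.price;
        list.appendChild(li);
      }
    });
  </script>
</body></html>
"""

# Hypothetical DOM after a browser (headless or not) has run the script.
RENDERED_HTML = """
<html><body>
  <ul id="prices"><li>9.99</li><li>19.99</li></ul>
</body></html>
"""


class ListItemCollector(HTMLParser):
    """Collects the text of every <li> element found in the markup."""

    def __init__(self):
        super().__init__()
        self.in_li = False
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.in_li = True
            self.items.append("")

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_li = False

    def handle_data(self, data):
        # Script source also arrives here, but it is ignored outside <li>.
        if self.in_li:
            self.items[-1] += data


def list_items(html):
    """Return the <li> texts that are literally present in the given HTML."""
    parser = ListItemCollector()
    parser.feed(html)
    return parser.items


print(list_items(RAW_HTML))       # nothing to extract from the raw response
print(list_items(RENDERED_HTML))  # the data appears only after rendering
```

The parser itself is identical in both cases. The only difference is whether the script has run, which is the work a headless browser takes on and the source of the extra computing cost.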

Finally, ethics have been in the limelight. For example, residential proxy providers are looking for ways to inform and reward participants of the network. We ourselves took charge of building the framework for ethical acquisition, which, I believe, has played a part in the fact that there are fewer shady practices and more clarity among all industry participants.

To answer the second question, Oxylabs have reacted to these changes with the development of Scraper APIs. We created both dedicated and universal scrapers that can acquire publicly available data from nearly any website without issue. Additionally, all of our proxies are ethically sourced, giving our partners much-needed peace of mind when engaging in scraping.

Have you seen or noticed any particular trends in data acquisition or web scraping? Are specific data types becoming popular?

Off the cuff I’d say that the use of ecommerce and delivery data has been booming since the pandemic hit. Businesses want to (legally) spy on competitors and gain access to as much data as possible. Data types like pricing, products or delivery times are important to any competitor.

But these have always been important. Maybe I would say that external data in general has risen in importance. Outside of that, I don’t think there have been any particular trends in data types. There have been, however, changes in the entire supply chain. As I’ve mentioned, businesses only really need the data. Even then, the data is not the key – insights are.

As such, businesses at the tail-end of the chain have proliferated in recent years. Data-as-a-service aggregators, ones that collect information and sell sets of it, have been rising in popularity.

There are also some businesses that provide insights directly. While these are still few and far between, some of them have unique value propositions that I could see as worthwhile. Jungle Scout, for example, is a service that both scrapes external data and has large datasets from internal sources. As such, they can provide insights other businesses can’t.

What do you think are the biggest challenges the industry is facing currently? Are there any innovative solutions to these or other challenges on the horizon?

Bot protection has always been the greatest challenge. Scraping, you see, is a cat-and-mouse game. Websites attempt to implement anti-bot measures, such as the well-known CAPTCHA, while scraping companies attempt to continue evading them to retain access to data.

There have been great strides made in bot protection. TLS (Transport Layer Security) fingerprinting has been one such improvement. Sophisticated websites can fingerprint the initial TLS handshake and check whether it matches the client that the request headers claim to be. As many scraping tools manually modify the headers they send, the TLS fingerprint often doesn’t match them, which is a dead giveaway.
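The mismatch described above can be sketched as a toy server-side check. Everything here is an assumption made for illustration: the handshake parameters, the way they are hashed into a fingerprint, the table of known clients, and the crude User-Agent test. Real anti-bot systems are far more elaborate and are not described in the interview.

```python
import hashlib


def handshake_fingerprint(tls_version, ciphers, extensions):
    """Hash the ordered handshake parameters into a fingerprint.

    Illustrative only: the choice of fields and of SHA-256 is an assumption.
    """
    raw = ",".join([
        str(tls_version),
        "-".join(str(c) for c in ciphers),
        "-".join(str(e) for e in extensions),
    ])
    return hashlib.sha256(raw.encode("ascii")).hexdigest()


# Hypothetical handshake parameters for two kinds of client.
BROWSER_HELLO = (771, [4865, 4866, 4867], [0, 23, 65281, 10, 11])
SCRIPT_HELLO = (771, [4866, 4867, 4865, 49196], [0, 11, 10, 13])

# Hypothetical table of fingerprints the website has seen before.
KNOWN_CLIENTS = {
    handshake_fingerprint(*BROWSER_HELLO): "browser",
    handshake_fingerprint(*SCRIPT_HELLO): "http-library",
}


def claimed_family(user_agent):
    """Very rough guess at what the User-Agent header claims to be."""
    return "browser" if "mozilla" in user_agent.lower() else "http-library"


def looks_spoofed(hello, user_agent):
    """True when the handshake and the headers point to different clients."""
    observed = KNOWN_CLIENTS.get(handshake_fingerprint(*hello))
    if observed is None:
        return False  # unknown fingerprint: this sketch makes no judgement
    return observed != claimed_family(user_agent)


# A script that copies a browser's User-Agent but keeps its own TLS stack.
print(looks_spoofed(SCRIPT_HELLO, "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
# A genuine browser whose handshake and headers agree.
print(looks_spoofed(BROWSER_HELLO, "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
```

Changing a header is a one-line edit in most scraping tools, whereas the handshake is produced by the underlying TLS library and is much harder to alter, which is why the two can drift apart.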

On the other hand, the deck is always slightly stacked in the favor of scraping. Most anti-bot protection features put a dent in the overall user experience. Filling in a CAPTCHA is something that detracts from that frictionless experience of the modern web we’re used to.

Some businesses use these techniques and see no issue. Others, ones highly concerned with delivering the best user experience possible, avoid using CAPTCHAs unless absolutely necessary. It’s always a tradeoff. More bot protection equals, almost always, worse UX, which leads to less revenue. But then fewer people are scraping your website.

Additionally, new pages with interesting data and content appear all the time. And you don’t start building a website from bot protection. It has to be functional first. So, for a long time after launch, scraping a website is a lot easier than it could be.

Would you say that there are potential benefits in web scraping for academic research or policy-making? If so, why hasn’t the scientific or political community adopted the practice?

Academic research, quantitative in particular, is in large part based on data that doesn’t exist on the internet, yet. There could be studies, however, on internet behavior or something of the like where scraping could be immensely useful. Additionally, I think we’re not seeing such widespread adoption due to the previously mentioned barrier to entry.

Let’s imagine that there’s no previous scraping experience in some particular university. The researcher would have to build everything from the ground up, get all the deep knowledge, and the funding required just to start acquiring the data.

It doesn’t help that the research areas that benefit the most from scraping (like sociology, economics, psychology, etc.) are far removed from coding, development, and IT in general. I think it’s more of an unfortunate, but temporary, circumstance, because web scraping providers will be able to reduce the barrier by a significant margin in the future.

When it comes to policy-making, I’m not so sure. I think that rather than making, it should be about enforcing. Governments are definitely knee-deep in web scraping for all kinds of security purposes. Businesses, on the other hand, have been using the same processes to protect themselves from counterfeits and copyright infringement. There’s an entire business vertical dedicated explicitly to brand protection.

 

Banking

Building towards an inclusive financial future

By Catharina Eklof, CCO of IDEX Biometrics

  

From the visually impaired to displaced migrants, the unbanked, and people living with dementia – a burgeoning financial gap exists across many areas of society. In fact, as of late 2021, almost one-third of adults around the world were reported as unbanked according to the World Bank Group. That’s around 1.7 billion people – with half coming from the poorest 40% of the world’s population. Being financially excluded in this way means not having access to common financial services including savings accounts, loans, a credit rating, or even a bank account. Those who are awaiting clearance to join a country’s financial ecosystem, such as migrants, are also finding themselves left behind by the modern financial infrastructure.

As society’s reliance on digital and contactless transactions over cash continues to grow, this financial gap is only set to widen. In less than 10 years, the share of Americans not using cash for payments has increased by double digits, reaching 41%. By 2031, cash payments are expected to make up only 6% of all transactions.

Fortunately, biometric smart cards can bridge this gap for people in the Global South, migrant populations, as well as those with visual or cognitive disabilities worldwide, who deserve to feel secure, included, and independent.

 

The challenges surrounding passwords

COVID accelerated the transition from cash to contactless payments and the use of digital wallets, creating a challenge for many. By 2024, it is expected that digital wallets and cards will account for 84.5% of all e-commerce spend.

Digital transactions traditionally rely on the use of PINs that can easily be forgotten, as studies have found that we manage 100 passwords on average across various sites and services. In the US alone, consumers report relationships with more than three financial institutions and have more than four accounts per household. The challenge of password recollection is only growing. To counter rising cybersecurity threats, several countries now mandate two-factor authentication for retailers and service providers, creating further complexity.

However, organizations are responding to financial exclusion. Card provider Mastercard introduced its contactless PayPass offering, as well as its Touch Card, developed alongside Amjan Bank, which enables the visually impaired to distinguish between their cards. Both look to provide a better customer experience for people struggling with the digital changeover. For those living with dementia, Mastercard has also partnered with Sibstar and the Alzheimer’s Society to create a specific card where limits, transactions, top-ups and notifications can be viewed and managed via a complementing app. Likewise, Turkish neo bank Papara introduced a Bluetooth debit card that provides visually impaired users with audio prompts when making payments.

 

Protecting the visually impaired

There are at least 2.2 billion visually impaired people globally. In 2019, it was found that 89% of visually impaired people had been victims of fraud or had made errors when paying for goods and services. This figure predates the pandemic and the proliferation of digital transactions, suggesting an even bigger concern today.

PINs present an obvious security issue for this demographic, as others can watch their inputs and then exploit them. Contactless payments go some way to solving that problem but pose a risk of fraud, as there is no PIN verification below the threshold amount, which keeps rising and is now £100 in the UK, where the average annual wage is £27,756. In India, where the average annual wage is 9,45,489 rupees (roughly £9,000), contactless limits are set at 5,000 rupees (£48). Many accounts also require visual inputs to prove identity, such as CAPTCHA, which is a barrier for the visually impaired.

Enhancing awareness on a regulatory level is key for driving change and reassuring vulnerable groups. The EU Accessibility Act is an example of how payment service providers are obliged to comply with accessibility standards. This includes making interfaces perceivable, operable, understandable, and robust, to ensure that individuals with disabilities can effectively navigate payment interfaces.

 

Paving the way with biometrics

Including braille on cards for easy identification is a crucial step for the visually impaired. This can also be used on biometric smart cards, with sensor textures to confirm the user has selected the correct method of transacting. Not only do these cards provide convenience and inclusivity, but they also promote ultimate security by linking a person’s identity directly to their fingerprints. This data is encrypted within the card itself, reducing any concerns surrounding fraudulent behaviour or data being lost via a centralized breach or large-scale hack.

In this context, biometrics can be used to serve the unbanked and those currently unrecognized within national infrastructures. South America is an example of an early adopter of biometrics, turning to the solution to cope with swelling population sizes, and the challenges associated with accessing proof of identity when setting up traditional bank accounts. Meanwhile in India, pension payment fraud has dropped by 47% thanks to bypassing the need for prior credit ratings or credentials.

Liveness detection, however, which ensures the biometric sensor is reading a true biometric source (rather than a false or recreated image of one), is vital to the success of financial aid programs globally. Securing remittances through biometric authentication ensures transparency and better fund control. Directing funds to cold wallets or biometrically authenticated cards can also improve program efficiency, safeguarding the interests of individuals and communities.

Overall, the biometrics market is expected to grow to US$87.4 billion by 2028, at a CAGR of 17%. Whilst its value as a simple and secure method of transacting is growing substantially, you can’t put a price on its impact on those who have so far fallen through the gaps of finance’s digital revolution.


Business

Euro deep tech M&A deal value expected to reach $20bn+ in the next 15 months

Written by Oliver Warren, Associate at DAI Magister

 

Investment in European deep tech has mirrored the broader decline in the technology sector; it has halved since the peak of 2021’s boom, reflecting investor preferences for ventures with lower capital expenditures and associated risks. Start-ups in the Health and Bio, Transportation, Energy, and SaaS and AI verticals experienced the most significant drops.

However, Dealroom data shows stark differences in funding for deep tech start-ups at the early, breakout (Series B & C), and late stages. After a modest deceleration between 2021 and 2022, early-stage deep tech fundraising has been surprisingly healthy, bucking the market trend, due in part to the hype surrounding generative AI. In Q1 2023, early-stage rounds received their highest infusion of capital in over a year.

However, this positive trend conceals a sharp decline in B and C round fundraises, which have seen investment activity plummet to $1 billion in Q1 2023 from a peak of $3 billion in Q1 2022. Late-stage rounds (>$100M) have also experienced massive declines, falling almost 70% from $2 billion in Q1 2022 to $634 million in Q1 2023.

 

$20bn+ worth of deep tech M&A in the next 15 months alone

While venture capital continues to show interest in the sector, the retreat of growth investors and the genuine prospect of a prolonged down cycle ahead has left growth-stage deep tech companies needing to implement stringent cost-cutting strategies to curtail expenses and extend their runways. But even those fortunate enough to have secured inflated funding rounds during the exuberant market conditions of 2021 will soon need additional investment.

Deep tech companies typically have high burn rates due to their heavy focus on research and development, requiring funding approximately every two years on average. With dwindling access to VC cheques, a non-existent IPO market, and practical limits to self-sufficiency, M&A is already emerging as a valid route to realising substantial profits for investors and founders, even if it doesn’t deliver the lofty $1bn+ valuations seen in 2021.

We’re already seeing more companies take this route. European deep tech M&A activity has rebounded to levels not seen for years. Across our focus verticals, spanning Advanced Materials, Space, AI & ML, Cybersecurity, and Robotics, European M&A transactions have already surpassed 2020 levels (183 this year, annualised, versus 176 in 2020), with some notable exits such as InstaDeep’s sale to BioNTech and the acquisition of SLM Solutions’ metal 3D printing business by Nikon.

In 2024, we forecast 250+ M&A deals in European deep tech, with at least 20 above $100m, making it the strongest M&A year since 2016. A key driver of this resurgence is the substantial increase in established deep tech companies across Europe, with many more companies fielding 100+ employees and sizeable, valuable engineering teams. The funding-driven growth in the size of European deep tech companies now makes many more sizeable, more strategic targets for international acquirers.

Overall, we anticipate the remainder of 2023 and 2024 will be banner years for European deep tech M&A, with potential deal value reaching $20 billion or more in the next 15 months alone.

 

 
