Business
Web Scraping in 2022 & Beyond
Published 2 years ago by admin
Web scraping has been coming into the limelight in recent years due to the rising interest in data. Businesses across the globe have been eyeing automated data collection as a way to enhance their profitability and overall decision making.
We’ve sat down with the Lead of Commercial Product Owners at Oxylabs.io, Nedas Višniauskas, to talk about the future of web scraping. Few people have been as deeply involved with the industry as Nedas, which has allowed him to gain a unique perspective on how it has developed and how it will continue to do so.
What do you think has been the biggest change in web scraping over the last decade? How has Oxylabs participated in these changes?
There have been some interesting changes during the past few years. One of them, I think, has been the proliferation of increasingly sophisticated anti-bot systems. Scraping such websites at scale, in turn, becomes more difficult.
Scraping enthusiasts, of course, have their own answer to these issues, which is to develop dedicated data collection tools. These tools, while narrower in their field of use, can bypass anti-bot systems and are constantly updated for that purpose.
Another important change has been the rising popularity of JavaScript. More and more websites are using it to load critically important data dynamically, which means it’s essentially unreachable without browsers.
Headless ones, therefore, are a necessity. At the same time, that means infrastructure costs are rising as headless browsers take up much more computing power and traffic than simple HTTP requests.
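To illustrate the point about dynamically loaded data, here is a minimal Python sketch. It assumes a hypothetical page whose prices sit in an element with the id `prices`; the two HTML strings, the `/api/prices` path, and the `text_by_id` helper are invented for illustration and are not taken from any real site or tool. The first string mimics what a plain HTTP request would return, the second what a browser would hold after running the page's script.

```python
from html.parser import HTMLParser


class TextById(HTMLParser):
    """Collect the text found inside the element with a given id.

    Simplification: assumes no void tags (such as <br>) inside the target.
    """

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # greater than 0 while inside the target element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data.strip())


def text_by_id(html, target_id):
    """Return the visible text inside the element with the given id."""
    parser = TextById(target_id)
    parser.feed(html)
    return " ".join(chunk for chunk in parser.chunks if chunk)


# Hypothetical response to a plain HTTP request: the container is empty
# because a script is expected to fill it in later.
STATIC_HTML = (
    '<html><body><div id="prices"></div>'
    '<script>fetch("/api/prices").then(render)</script></body></html>'
)

# Hypothetical DOM after a (headless) browser has executed the script.
RENDERED_HTML = (
    '<html><body><div id="prices">'
    "<span>Widget</span> <span>9.99</span></div></body></html>"
)

print(repr(text_by_id(STATIC_HTML, "prices")))
print(repr(text_by_id(RENDERED_HTML, "prices")))
```

The static response yields nothing for the price container, while the rendered version yields the data. That gap is what a headless browser closes, at the price of far more computing power and traffic per page.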
Finally, ethics have been in the limelight. For example, residential proxy providers are looking for ways to inform and reward participants of the network. We ourselves took charge of building the framework for ethical acquisition, which, I believe, has played a part in the fact that there are fewer shady practices and more clarity among all industry participants.
To answer the second question, Oxylabs has reacted to these changes with the development of Scraper APIs. We created both dedicated and universal scrapers that can acquire publicly available data from nearly any website without issue. Additionally, all of our proxies are ethically sourced, giving our partners much-needed peace of mind when engaging in scraping.
Have you seen or noticed any particular trends in data acquisition or web scraping? Are specific data types becoming popular?
Off the cuff, I’d say that the use of ecommerce and delivery data has been booming since the pandemic hit. Businesses want to (legally) spy on competitors and gain access to as much data as possible. Data types like pricing, products, or delivery times are important to any competitor.
But these have always been important. Maybe I would say that external data in general has risen in importance. Outside of that, I don’t think there have been any particular trends in data types. There have been, however, changes in the entire supply chain. As I’ve mentioned, businesses only really need the data. Even then, the data is not the key – insights are.
As such, businesses at the tail-end of the chain have proliferated in recent years. Data-as-a-service aggregators, ones that collect information and sell sets of it, have been rising in popularity.
There are also some businesses that provide insights directly. While these are still few and far between, some of them have unique value propositions that I could see as worthwhile. Jungle Scout, for example, is a service that both scrapes external data and has large datasets from internal sources. As such, they can provide insights other businesses can’t.
What do you think are the biggest challenges the industry is facing currently? Are there any innovative solutions to these or other challenges on the horizon?
Bot protection has always been the greatest challenge. Scraping, you see, is a cat-and-mouse game. Websites attempt to implement anti-bot measures, such as the well-known CAPTCHA, while scraping companies attempt to continue evading them to retain access to data.
There have been great strides made in bot protection. TLS (Transport Layer Security) fingerprinting has been one such improvement. Sophisticated websites can fingerprint the initial network handshake and compare it with the request headers. As many scraping tools manually modify the headers they send, the TLS fingerprint often does not match them, which is a dead giveaway.
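The consistency check described here can be sketched roughly as follows. This is a toy illustration, not how any particular anti-bot product works: the fingerprint labels, the lookup table, and both function names are hypothetical, and a real system would derive the fingerprint from the handshake itself rather than receive it as a ready-made string.

```python
# Hypothetical lookup table mapping a TLS fingerprint label to the client
# family that normally produces it. The labels are placeholders, not real
# fingerprint values.
KNOWN_FINGERPRINTS = {
    "fp-chrome-example": "chrome",
    "fp-python-requests-example": "python-requests",
}


def claimed_family(user_agent):
    """Guess which client family the User-Agent header claims to be."""
    ua = user_agent.lower()
    if "python-requests" in ua:
        return "python-requests"
    if "chrome" in ua:
        return "chrome"
    return "unknown"


def looks_mismatched(tls_fingerprint, user_agent):
    """Return True when the handshake and the headers tell different stories."""
    observed = KNOWN_FINGERPRINTS.get(tls_fingerprint)
    if observed is None:
        return False  # unknown fingerprint: this sketch gives no verdict
    return observed != claimed_family(user_agent)


# A scraper that rewrites its headers to look like Chrome but still
# performs the handshake of an HTTP library.
spoofed_ua = "Mozilla/5.0 (X11; Linux x86_64) Chrome/116.0 Safari/537.36"
print(looks_mismatched("fp-python-requests-example", spoofed_ua))
print(looks_mismatched("fp-chrome-example", spoofed_ua))
```

Changing a header is a one-line edit, whereas the handshake is produced by the underlying network library, which is why the two can disagree.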
On the other hand, the deck is always slightly stacked in the favor of scraping. Most anti-bot protection features put a dent in the overall user experience. Filling in a CAPTCHA is something that detracts from that frictionless experience of the modern web we’re used to.
Some businesses use these techniques and see no issue. Others, ones highly concerned with delivering the best user experience possible, avoid using CAPTCHAs unless absolutely necessary. It’s always a tradeoff. More bot protection almost always means worse UX, which leads to less revenue. But then fewer people are scraping your website.
Additionally, new pages with interesting data and content appear all the time. And you don’t start building a website from bot protection. It has to be functional first. So, for a long time after a site launches, scraping it is a lot easier than it could be.
Would you say that there are potential benefits in web scraping for academic research or policy-making? If so, why hasn’t the scientific or political community adopted the practice?
Academic research, quantitative in particular, is in large part based on data that doesn’t exist on the internet, yet. There could be studies, however, on internet behavior or something of the like where scraping could be immensely useful. Additionally, I think we’re not seeing such widespread adoption due to the previously mentioned barrier to entry.
Let’s imagine that there’s no previous scraping experience in some particular university. The researcher would have to build everything from the ground up, get all the deep knowledge, and the funding required just to start acquiring the data.
It doesn’t help that the research areas that benefit the most from scraping (like sociology, economics, psychology, etc.) are far removed from coding, development, and IT in general. I think it’s more of an unfortunate, but temporary, circumstance, because web scraping providers will be able to reduce the barrier by a significant margin in the future.
When it comes to policy-making, I’m not so sure. I think that rather than making, it should be about enforcing. Governments are definitely knee-deep in web scraping for all kinds of security purposes. Businesses, on the other hand, have been using the same processes to protect themselves from counterfeits and copyright infringement. There’s an entire business vertical dedicated explicitly to brand protection.
Banking
Building towards an inclusive financial future
Published September 22, 2023 by editorial
By Catharina Eklof, CCO of IDEX Biometrics
From the visually impaired to displaced migrants, the unbanked, and people living with dementia – a burgeoning financial gap exists across many areas of society. In fact, as of late 2021, almost one-third of adults around the world were reported as unbanked according to the World Bank Group. That’s around 1.7 billion people – with half coming from the poorest 40% of the world’s population. Being financially excluded in this way means not having access to common financial services including savings accounts, loans, a credit rating, or even a bank account. Those who are awaiting clearance to join a country’s financial ecosystem, such as migrants, are also finding themselves left behind by the modern financial infrastructure.
As society’s reliance on digital and contactless transactions over cash continues to grow, this financial gap is only set to widen. In less than 10 years, the share of Americans not using cash for payments has increased by double digits, reaching 41%. By 2031, cash payments are expected to make up only 6% of all transactions.
Fortunately, biometric smart cards can bridge this gap for people in the Global South, migrant populations, as well as those with visual or cognitive disabilities worldwide, who deserve to feel secure, included, and independent.
The challenges surrounding passwords
COVID accelerated the transition from cash to contactless payments and the use of digital wallets, creating a challenge for many. By 2024, it is expected that digital wallets and cards will account for 84.5% of all e-commerce spend.
Digital transactions traditionally rely on the use of PINs that can easily be forgotten, as studies have found that we manage 100 passwords on average across various sites and services. In the US alone, consumers report relationships with more than three financial institutions and have more than four accounts per household. The challenge of password recollection is only growing. To counter rising cybersecurity threats, several countries now mandate two-factor authentication for retailers and service providers, creating further complexity.
However, organizations are responding to financial exclusion. Card provider Mastercard introduced its contactless PayPass offering, as well as its Touch Card, developed alongside Amjan Bank, which enables the visually impaired to distinguish between their cards. Both look to provide a better customer experience for people struggling with the digital changeover. For those living with dementia, Mastercard has also partnered with Sibstar and the Alzheimer’s Society to create a specific card where limits, transactions, top-ups and notifications can be viewed and managed via a companion app. Likewise, Turkish neo bank Papara introduced a Bluetooth debit card that provides visually impaired users with audio prompts when making payments.
Protecting the visually impaired
There are at least 2.2 billion visually impaired people globally. In 2019, it was found that 89% of visually impaired people have been victims of fraud or have made errors when paying for goods and services. This figure predates the pandemic and the proliferation of digital transactions, suggesting an even bigger concern today.
PINs present an obvious security issue for this demographic, as others can watch their inputs and then exploit them. Contactless payments go some way to solving that problem but pose a fraud risk, as there is no PIN verification below the contactless threshold, which keeps increasing and now stands at £100 in the UK, where the average annual wage is £27,756. In India, where the average annual wage is 9,45,489 rupees (roughly £9,000), the contactless limit is set at 5,000 rupees (£48). Many accounts also require visual inputs to prove identity, such as CAPTCHA, which acts as a barrier for the visually impaired.
Enhancing awareness on a regulatory level is key for driving change and reassuring vulnerable groups. The EU Accessibility Act is an example of how payment service providers are obliged to comply with accessibility standards. This includes making interfaces perceivable, operable, understandable, and robust, to ensure that individuals with disabilities can effectively navigate payment interfaces.
Paving the way with biometrics
Including braille on cards for easy identification is a crucial step for the visually impaired. This can also be used on biometric smart cards, with sensor textures to confirm the user has selected the correct method of transacting. Not only do these cards provide convenience and inclusivity, but they also promote strong security by linking a person’s identity directly to their fingerprints. This data is encrypted within the card itself, reducing concerns about fraudulent behaviour or about data being lost via a centralized breach or large-scale hack.
In this context, biometrics can be used to serve the unbanked and those currently unrecognized within national infrastructures. South America is an example of an early adopter of biometrics, turning to the solution to cope with swelling population sizes, and the challenges associated with accessing proof of identity when setting up traditional bank accounts. Meanwhile in India, pension payment fraud has dropped by 47% thanks to bypassing the need for prior credit ratings or credentials.
Liveness detection, however, which ensures the biometric sensor is reading a true biometric source (rather than a false or recreated image of one), is vital to the success of financial aid programs globally. Securing remittances through biometric authentication ensures transparency and better fund control. Directing funds to cold wallets or biometrically authenticated cards can also improve program efficiency, safeguarding the interests of individuals and communities.
Overall, the biometrics market is expected to grow to US$87.4 billion by 2028, at a CAGR of 17%. Whilst its value as a simple and secure method of transacting is growing substantially, you can’t put a price on its impact on those who have so far fallen through the gaps of finance’s digital revolution.
Business
Euro deep tech M&A deal value expected to reach $20bn+ in the next 15 months
Published September 22, 2023 by editorial
Written by Oliver Warren, Associate at DAI Magister
Investment in European deep tech has mirrored the broader decline in the technology sector; it has halved since the peak of 2021’s boom, reflecting investor preferences for ventures with lower capital expenditures and associated risks. Start-ups in the Health and Bio, Transportation, Energy, and SaaS and AI verticals experienced the most significant drops.
However, Dealroom data shows stark differences in funding for deep tech start-ups at the early, breakout (Series B & C), and late stages. After a modest deceleration between 2021 and 2022, early-stage deep tech fundraisings have been surprisingly healthy, bucking the market trend, due in part to the hype surrounding generative AI. In Q1 2023, they received the highest infusion of capital in over a year.
However, this positive trend conceals a sharp decline in B and C round fundraises, which have seen investment activity plummet to $1 billion in Q1 2023 from a peak of $3 billion in Q1 2022. Late-stage rounds (>$100M) have also experienced massive declines, falling almost 70% from $2 billion in Q1 2022 to $634 million in Q1 2023.
$20bn+ worth of deep tech M&A in the next 15 months alone
While venture capital continues to show interest in the sector, the retreat of growth investors and the genuine prospect of a prolonged down cycle ahead have left growth-stage deep tech companies needing to implement stringent cost-cutting strategies to curtail expenses and extend their runways. But even those fortunate enough to have secured inflated funding rounds during the exuberant market conditions of 2021 will soon need additional investment.
Deep tech companies typically have high burn rates due to their heavy focus on research and development, requiring funding approximately every two years on average. With dwindling access to VC cheques, a non-existent IPO market, and practical limits to self-sufficiency, M&A is already emerging as a valid route to realising substantial profits for investors and founders, even if it doesn’t deliver the lofty $1bn+ valuations seen in 2021.
We’re already seeing more companies take this route. European deep tech M&A activity has rebounded to levels not seen for years. Across our focus verticals, spanning Advanced Materials, Space, AI & ML, Cybersecurity, and Robotics, European M&A transactions have already surpassed 2020 levels (183 this year on an annualised basis versus 176 in 2020), with some notable exits such as InstaDeep’s sale to BioNTech and the acquisition of SLM Solutions’ metal 3D printing business by Nikon.
In 2024, we forecast 250+ M&A deals in European deep tech, with at least 20 above $100m, making it the strongest M&A year since 2016. A key driver of this resurgence is the substantial increase in established deep tech companies across Europe, with many more companies fielding 100+ employees and sizeable, valuable engineering teams. The funding-driven growth in the size of European deep tech companies now makes many more of them sizeable, strategic targets for international acquirers.
Overall, we anticipate the remainder of 2023 and 2024 will be banner years for European deep tech M&A, with potential deal value reaching $20 billion or more in the next 15 months alone.