Logging millions of requests every day and what it takes

Author
Guest Author
February 26, 2015
3 min read

HackerEarth's web servers handle millions of requests every day. These request logs can be analyzed to extract useful insights and business-critical metrics, for example, the number of views per day, the number of views per sub-product, and the most popular user navigation flows.

Initial Thoughts

HackerEarth uses Django as its primary web development framework, along with a host of other components that have been customized for performance and scalability. During normal operations, our servers handle 80–90 requests/sec on average, and this surges to 200–250 requests/sec when multiple contests overlap. We needed a system that could easily scale to a peak traffic of 500 requests/sec. The system also had to add minimal processing overhead to the web servers, and the collected data had to be stored for crunching and offline processing.
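
As a rough check on these figures, a per-second rate can be converted into a daily volume. The sketch below assumes the rate is sustained for a full 24 hours, which overstates real traffic for the peak figure. The function name is illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def requests_per_day(requests_per_second):
    """Daily request volume if the given rate were sustained for 24 hours."""
    return requests_per_second * SECONDS_PER_DAY


for label, rate in [("average (low)", 80), ("average (high)", 90), ("design peak", 500)]:
    print("%-15s %3d req/s -> %d requests/day" % (label, rate, requests_per_day(rate)))
```

At the average rates quoted above, this works out to roughly 7 million requests per day, which is consistent with "millions of requests every day".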

Architecture

Figure: Logging Architecture

The diagram above shows a high-level architecture of our request log collection system. The solid connection lines represent the data flow between components, and the dotted lines represent communication between them. The whole architecture is message-based and stateless, so individual components can be removed or replaced without any downtime.

Each component is described below, in the order of data flow.

Web Servers

On the web servers, we use a Django middleware that asynchronously collects the required data for each request and forwards it to the Transporter Cluster servers. This is done in a separate thread, and the middleware adds an overhead of 2 milliseconds to the Django request/response cycle.

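import datetime

from django.conf import settings

# run_async, log_request_async, and log are project helpers that are not shown in this post.


class RequestLoggerMiddleware(object):
    """
    Logs data from requests
    """
    def process_request(self, request):
        # Skip logging in local and debug environments.
        if settings.LOCAL or settings.DEBUG:
            return None

        # Note: request.is_ajax() was deprecated in Django 3.1 and removed in 4.0.
        is_ajax = request.is_ajax()
        request.META['IS_AJAX'] = is_ajax

        # Measure the overhead this middleware adds to the request cycle.
        before = datetime.datetime.now()

        # Ignore load balancer health checks.
        DISALLOWED_USER_AGENTS = ["ELB-HealthChecker/1.0"]
        http_user_agent = request.environ.get('HTTP_USER_AGENT', '')

        if http_user_agent in DISALLOWED_USER_AGENTS:
            return None

        # Start a thread that collects the required data and forwards it to the transporter cluster.
        run_async(log_request_async, request)
        after = datetime.datetime.now()

        log("TotalTimeTakenByMiddleware %s" % ((after - before).total_seconds()))
        return None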
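
The post does not show how run_async is implemented. A minimal sketch, assuming it simply starts a background thread for the given function, could look like the following. The body of the helper and the sample data are assumptions for illustration, not HackerEarth's actual code.

```python
import threading


def run_async(func, *args, **kwargs):
    """Hypothetical helper: run func(*args, **kwargs) in a background thread."""
    thread = threading.Thread(target=func, args=args, kwargs=kwargs)
    # Daemon threads do not block interpreter shutdown, so a log entry may be
    # lost on exit. The post accepts occasional loss of request logs.
    thread.daemon = True
    thread.start()
    return thread


# Usage: the caller returns immediately while the work runs in the thread.
collected = []
worker = run_async(collected.append, {"path": "/challenges/", "is_ajax": False})
worker.join()  # only needed here so the example can inspect the result
print(collected)
```

Starting one thread per request keeps the middleware simple. A bounded worker pool or queue would be an alternative if thread creation ever became a measurable cost.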

Transporter Cluster

The transporter cluster is an array of non-blocking Thrift servers whose sole purpose is to receive data from the web servers and route it to other components such as MongoDB, RabbitMQ, or Kafka. The destination of each message is specified in the message itself by the web server. Communication is one-way, from the web servers to the transporter servers, which saves the time that would otherwise be spent waiting for the Thrift servers to acknowledge each message. We may lose some request logs because of this, but we can afford to. The request logs are currently routed to the Kafka cluster. Communication between the web servers and the transporter servers takes 1–2 milliseconds on average, and the cluster can be scaled horizontally to handle an increase in load.

service DataTransporter {
    oneway void transport(1:map<string, string> message)
}
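
Because the service takes a map<string, string>, every value has to be a string, and the routing destination travels inside the message. The sketch below illustrates that idea in plain Python without the generated Thrift code. The "route" and "payload" keys, the Router class, and the list standing in for a Kafka producer are all assumptions.

```python
import json


def build_message(route, payload):
    """Flatten a payload into the map<string, string> shape the service expects.

    The "route" and "payload" keys are assumptions for illustration.
    """
    return {"route": route, "payload": json.dumps(payload, sort_keys=True)}


class Router(object):
    """Dispatches each message to the handler registered for its route."""

    def __init__(self):
        self.handlers = {}

    def register(self, route, handler):
        self.handlers[route] = handler

    def transport(self, message):
        handler = self.handlers.get(message.get("route"))
        if handler is None:
            return  # one-way call: nothing is reported back, the message is dropped
        handler(message["payload"])


kafka_sink = []  # stand-in for a Kafka producer
router = Router()
router.register("kafka", kafka_sink.append)

router.transport(build_message("kafka", {"path": "/", "status": 200}))
router.transport(build_message("unknown", {"path": "/lost"}))
print(kafka_sink)
```

The second call shows the trade-off of a one-way interface: a message with no matching route disappears silently, and the sender never finds out.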

Kafka Cluster

Kafka is a high-throughput distributed messaging system that supports the publish/subscribe messaging pattern. This messaging infrastructure enables us to build other pipelines that depend on this stream of request logs. Our Kafka cluster stores the last 15 days' worth of logs, so any new consumer that we implement can start processing data from up to 15 days back in time.
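
The post does not show the broker configuration. Assuming time-based retention is set with Kafka's log.retention.hours broker setting, a 15-day window translates as sketched below. The helper name is illustrative.

```python
RETENTION_DAYS = 15  # retention window mentioned in the post


def retention_property(days):
    """Render the broker setting for a retention window given in days."""
    return "log.retention.hours=%d" % (days * 24)


print(retention_property(RETENTION_DAYS))
```

A new consumer can only replay this history if it starts reading from the earliest retained offset rather than from the latest one.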

Useful reference for setting up a Kafka cluster.

Pipeline Manager Server

This server manages the consumption of request log messages from the Kafka topics, stores them in MongoDB, and later moves them to Amazon S3 and Amazon Redshift. MongoDB acts merely as a staging area for the data consumed from the Kafka topics; this data is transferred to S3 at hourly intervals. Every file that is saved in S3 is loaded into Amazon Redshift, a data warehouse solution that can scale to petabytes of data. We use Amazon Redshift for analysis and metrics calculation on the request log data. The server works in conjunction with a RabbitMQ cluster, which it uses to communicate task initiation and completion.

Here is the script that loads data from S3 into Redshift. It handles duplicate data by first deleting any existing rows that also appear in the new batch and then inserting the new data.
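
The hourly transfer from the staging area to S3 can be pictured as grouping staged records by hour and writing one file per group. The sketch below is an assumption about how that might look. The key layout, the record fields, and the one-JSON-object-per-line format are illustrative and are not taken from the actual pipeline.

```python
import datetime
import json
from collections import defaultdict


def hourly_key(timestamp):
    """Hypothetical S3 key layout: one delta file per hour."""
    return timestamp.strftime("request_log/%Y/%m/%d/%H.json")


def batch_by_hour(records):
    """Group staged records into newline-delimited JSON, keyed by hourly file."""
    batches = defaultdict(list)
    for record in records:
        line = json.dumps({"row_id": record["row_id"], "path": record["path"]}, sort_keys=True)
        batches[hourly_key(record["timestamp"])].append(line)
    return {key: "\n".join(lines) for key, lines in batches.items()}


staged = [
    {"row_id": "a1", "path": "/", "timestamp": datetime.datetime(2015, 2, 26, 13, 5)},
    {"row_id": "a2", "path": "/challenges/", "timestamp": datetime.datetime(2015, 2, 26, 13, 55)},
    {"row_id": "a3", "path": "/", "timestamp": datetime.datetime(2015, 2, 26, 14, 1)},
]
batches = batch_by_hour(staged)
for key in sorted(batches):
    print(key, len(batches[key].splitlines()))
```

Each resulting key would correspond to one delta file path of the kind passed to the load script.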

import subprocess

from django.conf import settings  # assumed: the project's Django settings


def load_s3_delta_into_redshift(s3_delta_file_path):
    bigdata_bucket = settings.BIGDATA_S3_BUCKET

    attrs = {
        'bigdata_bucket': bigdata_bucket,
        's3_delta_file_path': s3_delta_file_path,
    }

    complete_delta_file_path = "s3://{bigdata_bucket}/{s3_delta_file_path}".format(**attrs)
    schema_file_path = "s3://{bigdata_bucket}/request_log/s3_col_schema.json".format(**attrs)

    data = {
        'AWS_ACCESS_KEY_ID': settings.AWS_ACCESS_KEY_ID,
        'AWS_SECRET_ACCESS_KEY': settings.AWS_SECRET_ACCESS_KEY,
        'LOG_FILE': complete_delta_file_path,
        'schema_file_path': schema_file_path
    }

    S3_REDSHIFT_COPY_COMMAND = " ".join([
        "copy requestlog_staging from '{LOG_FILE}' ",
        "CREDENTIALS 'aws_access_key_id={AWS_ACCESS_KEY_ID};aws_secret_access_key={AWS_SECRET_ACCESS_KEY}'",
        "json '{schema_file_path}';"
    ]).format(**data)

    LOADDATA_COMMAND = " ".join([
        "begin transaction;",
        "create temp table if not exists requestlog_staging(like requestlog);",
        S3_REDSHIFT_COPY_COMMAND,
        'delete from requestlog using requestlog_staging where requestlog.row_id=requestlog_staging.row_id;',
        'insert into requestlog select * from requestlog_staging;',
        "drop table requestlog_staging;",
        'end transaction;'
    ])

    redshift_conn_args = {
        'host': settings.REDSHIFT_HOST,
        'port': settings.REDSHIFT_PORT,
        'username': settings.REDSHIFT_DB_USERNAME
    }

    REDSHIFT_CONNECT_CMD = 'psql -U {username} -h {host} -p {port}'.format(**redshift_conn_args)
    PSQL_LOADDATA_CMD = '%s -c "%s"' % (REDSHIFT_CONNECT_CMD, LOADDATA_COMMAND)

    returncode = subprocess.call(PSQL_LOADDATA_CMD, shell=True)
    if returncode != 0:
        raise Exception("Unable to load s3 delta file into redshift ", s3_delta_file_path)
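
The delete-then-insert step is what makes reloading the same file safe. The sketch below reproduces the pattern with Python's built-in sqlite3 module as a stand-in for Redshift. SQLite has no "delete ... using" syntax, so a subquery is used instead, and the two-column table layout is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table requestlog (row_id text, path text);
    create table requestlog_staging (row_id text, path text);
    insert into requestlog values ('a1', '/'), ('a2', '/old');
    insert into requestlog_staging values ('a2', '/new'), ('a3', '/');
""")

with conn:  # commits on success, rolls back if either statement fails
    # Remove rows that are about to be reloaded, then append the new batch.
    conn.execute(
        "delete from requestlog where row_id in (select row_id from requestlog_staging)"
    )
    conn.execute("insert into requestlog select * from requestlog_staging")

rows = conn.execute("select row_id, path from requestlog order by row_id").fetchall()
print(rows)
```

Row a2 appears once with its new value, so loading a batch that overlaps existing data does not create duplicates.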

What's next

Data is like gold for any web application. If done the right way, the insights it can provide and the growth it can drive are amazing. There are dozens of features and insights that can be built with the request logs, including a recommendation engine, better content delivery, and improvements to the overall product. All of this is a step toward making HackerEarth better every day for our users.

This post was originally written for the HackerEarth Engineering blog by Praveen Kumar.
