CSV Data Download Script

User
Assistant
Loading more than 50,000 records in a single query can exhaust PHP's memory limit, because every row is held in memory at once. To handle a dataset this large efficiently, the script needs to process records in batches and write each batch straight to the response. Here's an updated version of your script that fetches records in batches and streams the CSV output so memory usage stays flat:

<?php

/* Template Name: Download Data */

global $wpdb;

$sqlcon = '';
$sqlj = '';

if (isset($_GET['pagen'])) {
    $pageNum = max(1, (int) $_GET['pagen']); // Cast to a positive integer before using it in the query
} else {
    $pageNum = 1;
}

$batchSize = 5000; // Process 5000 records at a time
$offset = ($pageNum - 1) * $batchSize;

if (!empty($_GET['catex'])) {
    $catex = (int) $_GET['catex']; // Cast to an integer so it is safe to embed in the SQL below
    if ($catex != -1) {
        $sqlcon = " 
        AND tt.taxonomy = 'category'
        AND t.term_id = '" . $catex . "'";
    } else {
        $sqlcon = " 
        AND tt.taxonomy = 'category'
        AND t.term_id != '65'"; // All except plant name
    }

    $sqlj = "
    JOIN $wpdb->term_relationships tr ON (p.ID = tr.object_id)
    JOIN $wpdb->term_taxonomy tt ON (tr.term_taxonomy_id = tt.term_taxonomy_id)
    JOIN $wpdb->terms t ON (tt.term_id = t.term_id)";
}

// Set headers to stream the file
header('Content-Type: text/csv; charset=utf-8');
header('Content-Disposition: attachment; filename="post_report_data.csv"');
header('Pragma: no-cache');
header('Expires: 0');

// Open the output stream
$fp = fopen('php://output', 'w');

// Write the CSV headers
fputcsv($fp, ['Title', 'Url', 'Feature Image Url', 'Category', 'Revision Date', 'Meta Description']);

while (true) {
    $querystr = "SELECT 
        ID, post_title AS title, post_modified 
        FROM $wpdb->posts p 
        $sqlj
        WHERE p.post_type='post' AND p.post_status='publish' $sqlcon 
        ORDER BY post_date DESC 
        LIMIT $offset, $batchSize";

    $pageposts = $wpdb->get_results($querystr, OBJECT);

    if (empty($pageposts)) {
        break; // Stop if no more records
    }

    foreach ($pageposts as $post) {
        // wp_get_attachment_url() takes only the attachment ID; use get_the_post_thumbnail_url() if a specific size is needed
        $url = wp_get_attachment_url(get_post_thumbnail_id($post->ID));
        $categories = implode(', ', array_map(function ($cat) {
            return $cat->cat_name;
        }, get_the_category($post->ID)));
        $revision_date = get_the_modified_date('Y-m-d H:i:s', $post->ID);
        $meta_desc = get_post_meta($post->ID, '_yoast_wpseo_metadesc', true);

        fputcsv($fp, [
            $post->title,
            get_permalink($post->ID),
            $url,
            $categories,
            $revision_date,
            $meta_desc
        ]);
    }

    // Increment the offset for the next batch
    $offset += $batchSize;

    // Flush the output buffer (if one is active) so each batch is sent to the browser immediately
    if (ob_get_level() > 0) {
        ob_flush();
    }
    flush();
}

// Close the output stream
fclose($fp);

exit;
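
If the filter values ever come from untrusted input, the same clauses can also be built with `$wpdb->prepare()` instead of string concatenation. This is a minimal sketch of that variant, not part of the script above, assuming the same `$_GET['catex']` convention (a category term ID, or `-1` for everything except term 65):

```php
$sqlcon = '';
$sqlj   = '';

if (!empty($_GET['catex'])) {
    $catex = (int) $_GET['catex'];

    // Joins are the same as in the main script.
    $sqlj = "
    JOIN $wpdb->term_relationships tr ON (p.ID = tr.object_id)
    JOIN $wpdb->term_taxonomy tt ON (tr.term_taxonomy_id = tt.term_taxonomy_id)
    JOIN $wpdb->terms t ON (tt.term_id = t.term_id)";

    if ($catex != -1) {
        // %d forces the value to an integer inside the generated SQL
        $sqlcon = $wpdb->prepare(" AND tt.taxonomy = 'category' AND t.term_id = %d", $catex);
    } else {
        $sqlcon = $wpdb->prepare(" AND tt.taxonomy = 'category' AND t.term_id != %d", 65);
    }
}
```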

### Key Changes and Optimizations:
1. **Batch Processing**:
   - Fetches and processes records in batches of 5,000.
   - Prevents memory exhaustion by holding only a small number of records in memory at a time (a keyset-pagination alternative is sketched after this list).

2. **Streaming Output**:
   - Rows are written directly to the output stream (`php://output`) instead of being buffered in memory.

3. **Header Setup**:
   - Ensures the browser treats the response as a downloadable CSV file.

4. **Efficient Memory Management**:
   - Flushing the output buffer after each batch keeps the response streaming and the memory footprint flat.

5. **Pagination or Filtering**:
   - You can control the batch size via `$batchSize` or adjust the query parameters to target specific subsets of data.
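
On very large tables, `LIMIT offset, count` pagination gets slower as the offset grows, because MySQL still has to scan past every skipped row. A keyset-pagination variant of the batch loop avoids this; the sketch below is only an illustration, not part of the script above (it reuses `$sqlj`, `$sqlcon`, and `$batchSize`, orders by `p.ID` instead of `post_date`, and omits the row-writing code already shown):

```php
$lastId = PHP_INT_MAX; // Start above any real post ID

while (true) {
    // Each batch picks up where the previous one stopped, keyed on the last ID seen
    $querystr = $wpdb->prepare(
        "SELECT DISTINCT p.ID, p.post_title AS title, p.post_modified
         FROM $wpdb->posts p
         $sqlj
         WHERE p.post_type = 'post' AND p.post_status = 'publish' $sqlcon
           AND p.ID < %d
         ORDER BY p.ID DESC
         LIMIT %d",
        $lastId,
        $batchSize
    );

    $pageposts = $wpdb->get_results($querystr, OBJECT);

    if (empty($pageposts)) {
        break;
    }

    foreach ($pageposts as $post) {
        // ... build and fputcsv() the row exactly as in the main loop ...
        $lastId = $post->ID;
    }

    flush();
}
```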

This script should handle large datasets efficiently, even with over 50,000 records.
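
To run the export, assign the "Download Data" template to a page and open that page in the browser; the download starts immediately. The category filter is passed as a query-string parameter, for example `https://example.com/download-data/?catex=12`, or `?catex=-1` for everything except the excluded term (the domain and slug here are only illustrative). The `pagen` parameter lets you resume from a later batch, e.g. `?pagen=3` skips the first two batches of 5,000 rows.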
