Ticket #562 (closed defect: fixed)

Opened 14 years ago

Last modified 14 years ago

Debathena restart of CUPS during upgrade sleeps for 2 minutes

Reported by: kchen
Owned by:
Priority: normal
Milestone: The Distant Future
Component: --
Keywords:
Cc:
Fixed in version:
Upstream bug: CUPS:3574

Description

It's non-intuitive that:

Setting up debathena-cupsys-config (1.9~ubuntu8.04) ...
Status of Common Unix Printing System: cupsd is running.
 * Stopping Common Unix Printing System: cupsd                      [ OK ]
 * Starting Common Unix Printing System: cupsd                      [ OK ]

just waits here for two minutes in the middle of an apt-get upgrade. I control-C'd it several times before finally digging into the source to see what was going on.

Change History

comment:1 Changed 14 years ago by jdreed

  • Status changed from new to proposed

comment:2 Changed 14 years ago by broder

Unfortunately, it's harder to fix the underlying problem here than we might have hoped. cups-polld (which is actually responsible for executing BrowsePolling) apparently rate-limits itself when discovering remote queues:

Lines 305-318 of scheduler/cups-polld.c in CUPS 1.4.3:

   /*
    * Figure out how many printers/classes we have...
    */

    for (attr = ippFindAttribute(response, "printer-name", IPP_TAG_NAME),
             max_count = 0;
         attr != NULL;
         attr = ippFindNextAttribute(response, "printer-name", IPP_TAG_NAME),
             max_count ++);

    fprintf(stderr, "DEBUG: %s Found %d printers.\n", prefix, max_count);

    count     = 0;
    max_count = max_count / interval + 1;

combined with this gem (lines 424-439, ibid.):

     /*
      * Throttle the local broadcasts as needed so that we don't
      * overwhelm the local server...
      */

      count ++;
      if (count >= max_count)
      {
       /*
        * Sleep for a second...
        */

        count = 0;

        sleep(1);
      }

This means that cups-polld will only send (number_of_queues_on_remote_host / value_of_BrowseInterval + 1) queues per second to the local cupsd.

We set BrowseInterval to 600, there are 176 queues on the remote host, and the formula is computed with integers, so 176 / 600 truncates to 0 and max_count ends up as 1. That means we will only ever discover 1 queue per second.
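
The arithmetic is easy to check with shell integer math, which truncates the same way C int division does. This is only a sketch of the formula quoted above, not code from CUPS; the function name queues_per_second is made up for illustration, and the inputs are the numbers from this ticket.

```sh
#!/bin/sh
# Hypothetical helper mirroring "max_count = max_count / interval + 1"
# from the cups-polld.c excerpt. $(( )) uses integer division, like C.
queues_per_second() {
    queues=$1
    interval=$2
    echo $(( queues / interval + 1 ))
}

queues_per_second 176 600   # prints 1
queues_per_second 176 30    # prints 6
```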

There are a couple of options here, none of which are mutually exclusive:

  1. Make BrowseInterval a smaller number. So long as BrowseInterval is less than the number of queues on cluster-printers.mit.edu, it'll take about BrowseInterval seconds to get all the queues under that formula. So we could drop it from 600 to something like 30. But we need to coordinate with mmanley to make sure that won't hose the CUPS servers unreasonably. Also, it's just kind of dumb to ping the print servers every 30 seconds to see if there are new queues.
  2. We can spin up our own instance of cups-polld with BrowseInterval set much lower, then kill it once we've verified that we sucked down the new queues. Something like this:
Index: debian/restart-cups.sh
===================================================================
--- debian/restart-cups.sh      (revision 24561)
+++ debian/restart-cups.sh      (working copy)
@@ -26,11 +26,28 @@
                echo "Retrieving printer list, please wait..." >&2
                echo "(This may take up to 2 minutes)" >&2
                queue_count=$(lpstat -h "$browse_host" -a | wc -l)
+
+               if echo "$browse_host" | grep -q ':'; then
+                   browse_port="$(echo "$browse_host" | awk -F: '{print $2}')"
+                   browse_host="$(echo "$browse_host" | awk -F: '{print $1}')"
+               else
+                   browse_port=631
+               fi
+               start-stop-daemon --start \
+                   --chuid lp \
+                   --pidfile /var/run/debathena-cupsys-config-poll.pid \
+                   --startas /usr/lib/cups/daemon/cups-polld -- \
+                   "$browse_host" "$browse_port" 1 631
+
                timeout=0
                while [ $(lpstat -a | wc -l) -lt $queue_count ] && [ $timeout -le 120 ]; do
                    sleep 1
                    timeout=$((timeout+1))
                done
+
+               start-stop-daemon --stop \
+                   --oknodo \
+                   --pidfile /var/run/debathena-cupsys-config-poll.pid
            fi
        fi
 }
  3. Come up with and upstream a patch for cups-polld to not rate-limit itself the first time it runs. This doesn't look like it would be too hard, and we should probably do it anyway.
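
To compare the interval choices in the options above, here is a rough estimate of how many one-second sleeps cups-polld would take to relay every queue. It is a sketch that assumes the throttle behaves exactly as in the two excerpts (one sleep each time count reaches max_count) and ignores network and cupsd processing time. The function name estimated_sleep_seconds is hypothetical.

```sh
#!/bin/sh
# Hypothetical estimate based only on the quoted cups-polld.c excerpts:
# per_batch queues are relayed, then the daemon sleeps for one second.
estimated_sleep_seconds() {
    queues=$1
    interval=$2
    per_batch=$(( queues / interval + 1 ))
    echo $(( queues / per_batch ))
}

# 176 is the queue count quoted in this ticket; 600 is the current
# BrowseInterval, 30 is the value floated in the first option, and 1 is
# the interval passed to cups-polld in the patch above.
for interval in 600 30 1; do
    echo "interval=$interval: about $(estimated_sleep_seconds 176 "$interval")s of sleeping"
done
```

With the ticket's numbers this gives 176, 29, and 0 seconds respectively, which lines up with the "about BrowseInterval seconds" estimate for an interval of 30.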

comment:3 Changed 14 years ago by broder

  • Upstream bug set to CUPS:3574

comment:4 Changed 14 years ago by broder

I've uploaded a slightly more robust version of (2) from above to -proposed (r24596, with a touchup in r24598).

In addition to timing out at 30 seconds maximum, in my testing it now takes about 2 seconds to download the entire queue list.
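
The wait-with-timeout pattern can be sketched as a small POSIX shell function. This is not the code committed in r24596, which isn't shown in this ticket; the function name wait_for_queues and its argument order are assumptions for illustration only.

```sh
#!/bin/sh
# Hypothetical helper: poll a listing command once per second until it
# prints at least $1 lines, giving up after $2 seconds.
# Usage: wait_for_queues WANT LIMIT COMMAND [ARGS...]
wait_for_queues() {
    want=$1
    limit=$2
    shift 2
    waited=0
    # tr strips the padding that some wc implementations add.
    while [ "$("$@" | wc -l | tr -d ' ')" -lt "$want" ] &&
          [ "$waited" -lt "$limit" ]; do
        sleep 1
        waited=$((waited + 1))
    done
    # Succeed only if the listing actually reached the wanted size.
    [ "$("$@" | wc -l | tr -d ' ')" -ge "$want" ]
}

# Example call, assuming lpstat is installed and queue_count was
# obtained from the remote host as in the patch above:
#   wait_for_queues "$queue_count" 30 lpstat -a
```

Returning a status lets the caller tell "all queues arrived" apart from "gave up at the timeout", which a bare loop doesn't.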

comment:5 Changed 14 years ago by broder

  • Status changed from proposed to closed
  • Resolution set to fixed

Fix was moved to production, so hopefully this shouldn't be an issue anymore.
