Author Topic: Comparison  (Read 4956 times)

Offline jonnyboy

  • Overlord
  • *****
  • Posts: 1046
  • Helpful: +90/-1
  • Lazzy Trucker
    • View Profile
    • nZEDb
Comparison
« on: May 15, 2013, 05:11:56 am »
I have heard some people say that nZEDb does not create as many releases as Newznab+, so I decided to give it a very quick test.

Comparison between Newznab+ (nn+) and nZEDb. Using stock screen scripts, I backfilled a.b.classic.tv.shows to 500 days.
Initially, nn+ created 2783 releases where nZEDb created only 2251. However, nZEDb holds back any release whose headers carry no parts count until 2 hours after the last activity on its collection. Once those 2 hours had passed, nn+ was still at 2783 while nZEDb had reached 3331. I waited a full 4 hours after nn+ added its last part to make sure no more releases would be created. None were.
NN+ categorized all releases as TV Shows and nZEDb categorized all but 1. NN+ identified 480 releases and nZEDb identified 452.
After all releases were created and postprocessed, nn+ had 3884493 rows left in the parts table and 59128 in the binaries table, while nZEDb had 0 in both. The rows remaining in parts and binaries will just sit there, slowing the db down for other groups.
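
To put the counts above in proportion, here is a minimal Python sketch that works only from the numbers in this post. The "identified as a share of releases created" ratio is a derived figure, not something either indexer reports, and the variable names are made up for illustration.

```python
# Counts copied from the a.b.classic.tv.shows test above.
NN_RELEASES = 2783       # nn+ releases after the 2 hour window
NZEDB_RELEASES = 3331    # nZEDb releases after the 2 hour window
NN_IDENTIFIED = 480      # releases nn+ identified
NZEDB_IDENTIFIED = 452   # releases nZEDb identified


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# How many more releases nZEDb created, relative to nn+.
extra_releases_pct = percent(NZEDB_RELEASES - NN_RELEASES, NN_RELEASES)

# Share of each indexer's own releases that it identified (derived figure).
nn_identified_pct = percent(NN_IDENTIFIED, NN_RELEASES)
nzedb_identified_pct = percent(NZEDB_IDENTIFIED, NZEDB_RELEASES)

print(f"nZEDb created {extra_releases_pct:.1f}% more releases than nn+")
print(f"nn+ identified {nn_identified_pct:.1f}% of its releases")
print(f"nZEDb identified {nzedb_identified_pct:.1f}% of its releases")
```

Read this way, nZEDb creates noticeably more releases but identifies a smaller share of them.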
            
So, in conclusion, nZEDb created more releases than nn+ and cleaned up after itself much better. However, nZEDb still needs more work on naming and identifying releases, and I fully expect this area to improve much faster than nn+ can improve its regexes. This was just a test on a group that has good regexes. There are many groups with no regexes or poor ones.

~jonnyboy

Offline zombu2

  • Overlord
  • *****
  • Posts: 13
  • Helpful: +2/-0
    • View Profile
Re: Comparison
« Reply #1 on: May 18, 2013, 11:54:40 am »
Well said

Offline jonnyboy

Re: Comparison
« Reply #2 on: May 21, 2013, 10:45:24 am »
I was asked to run the same comparison on a busier group, such as a.b.teevee. Here are the results.

Comparison between Newznab+ (nn+) and nZEDb. Using stock screen scripts, I backfilled a.b.teevee to 30 days.

Initially, nn+ created 6663 releases where nZEDb created only 5358. However, nZEDb holds back any release whose headers carry no parts count until 2 hours after the last activity on its collection, so the initial nZEDb count is not final.
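
The hold window described above can be sketched roughly as follows. This is an illustration in Python, not nZEDb's actual code: only the 2 hour figure comes from the post. The `Collection` class, its fields, `ready_for_release`, and the rule that a collection with a known parts count is ready once complete are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# The 2 hour figure is from the post; every name below is hypothetical.
HOLD_WINDOW = timedelta(hours=2)


@dataclass
class Collection:
    expected_parts: Optional[int]  # None when the headers carry no parts count
    received_parts: int
    last_activity: datetime        # when the newest part for this collection arrived


def ready_for_release(c: Collection, now: datetime) -> bool:
    """Decide whether a collection may be turned into a release."""
    if c.expected_parts is not None:
        # Assumption: with a known parts count, completeness alone decides.
        return c.received_parts >= c.expected_parts
    # No parts count: wait until the collection has been quiet for the full window.
    return now - c.last_activity >= HOLD_WINDOW


now = datetime(2013, 5, 21, 12, 0)
quiet = Collection(None, 40, now - timedelta(hours=3))
active = Collection(None, 40, now - timedelta(minutes=30))
print(ready_for_release(quiet, now), ready_for_release(active, now))
```

Under these assumptions, a count taken right after backfilling undercounts: every collection without a parts count is still inside its window.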

I am counting only valid releases: those larger than 100MB, properly categorized as TV Shows or Movies, fully postprocessed, and with nfos. It took approximately 12 hours to postprocess all releases using the stock screen scripts. I disabled removeCrapReleases in nZEDb.

Newznab+ created 6273 valid releases and identified and postprocessed 4601 releases.
nZEDb created 6373 valid releases and identified and postprocessed 4217 releases.

It is obvious that nn+ has much better regexes for a.b.teevee than it has for a.b.classic.tv.shows. Still, nZEDb outperformed nn+ in getting valid releases and finding nfos. Although nZEDb created a few more valid releases than nn+, many were not identified during postprocessing.

queries used:
nZEDb:
Code:
-- nZEDb: summary counts for the comparison (1024 * 1024 * 100 bytes = 100MB).
select  (select count(*) from releases where size > 0) as releases,
        (select count(*) from releases where size > (1024 * 1024 * 100)) as 'valid releases',
        (select count(*) from collections) as collections,
        (select count(*) from parts) as parts,
        (select count(*) from binaries) as binaries,
        (select count(*) from releases where (rageID > 0 or imdbID > 0) and size > (1024 * 1024 * 100)) as 'identified and postprocessed',
        (select count(*) from releases where nfostatus = 1) as 'nfos found from postprocessing',
        -- identified valid releases as a percentage of all valid releases
        ((select count(*) from releases where (rageID > 0 or imdbID > 0) and size > (1024 * 1024 * 100)) /(select count(*) from releases where size > (1024 * 1024 * 100)) * 100) as 'percent of releases identified',
        (select first_record from groups where name = 'alt.binaries.teevee') as first_record;
+----------+----------------+-------------+-------+----------+------------------------------+--------------------------------+--------------------------------+--------------+
| releases | valid releases | collections | parts | binaries | identified and postprocessed | nfos found from postprocessing | percent of releases identified | first_record |
+----------+----------------+-------------+-------+----------+------------------------------+--------------------------------+--------------------------------+--------------+
|    11681 |           6373 |           0 | 81552 |     2406 |                         4217 |                           5227 |                        66.1698 |    500978822 |
+----------+----------------+-------------+-------+----------+------------------------------+--------------------------------+--------------------------------+--------------+

nn+:
Code:
-- nn+: same summary counts; nn+ has no collections table and tracks nfos via releasenfoID.
select  (select count(*) from releases where size > 0) as releases,
        (select count(*) from releases where size > (1024 * 1024 * 100)) as 'valid releases',
        (select count(*) from parts) as parts,
        (select count(*) from binaries) as binaries,
        (select count(*) from releases where (rageID > 0 or imdbID > 0) and size > (1024 * 1024 * 100)) as 'identified and postprocessed',
        (select count(*) from releases where releasenfoID > 0) as 'nfos found from postprocessing',
        -- identified valid releases as a percentage of all valid releases
        ((select count(*) from releases where (rageID > 0 or imdbID > 0) and size > (1024 * 1024 * 100)) /(select count(*) from releases where size > (1024 * 1024 * 100)) * 100) as 'percent of releases identified',
        (select first_record from groups where name = 'alt.binaries.teevee') as first_record;
+----------+----------------+--------+----------+------------------------------+--------------------------------+--------------------------------+--------------+
| releases | valid releases | parts  | binaries | identified and postprocessed | nfos found from postprocessing | percent of releases identified | first_record |
+----------+----------------+--------+----------+------------------------------+--------------------------------+--------------------------------+--------------+
|     6659 |           6273 | 153138 |     3480 |                         4601 |                           4925 |                        73.3461 |    500978822 |
+----------+----------------+--------+----------+------------------------------+--------------------------------+--------------------------------+--------------+
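
As a cross-check, the 'percent of releases identified' column can be re-derived from the 'valid releases' and 'identified and postprocessed' counts in the two result tables. A small Python sketch, using only the posted numbers (the dictionary layout and function name are just for illustration):

```python
# Counts copied from the a.b.teevee result tables above.
results = {
    "nZEDb": {"valid": 6373, "identified": 4217},
    "nn+": {"valid": 6273, "identified": 4601},
}


def percent_identified(valid, identified):
    """Identified valid releases as a percentage of all valid releases."""
    return 100.0 * identified / valid


for name, counts in results.items():
    pct = percent_identified(counts["valid"], counts["identified"])
    print(f"{name}: {pct:.4f}% of valid releases identified")

# Difference in valid releases between the two indexers.
valid_gap = results["nZEDb"]["valid"] - results["nn+"]["valid"]
print(f"nZEDb created {valid_gap} more valid releases than nn+")
```
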
« Last Edit: May 21, 2013, 10:52:28 am by jonnyboy »

Offline slypknot

  • Junior Indexer
  • **
  • Posts: 17
  • Helpful: +1/-0
    • View Profile
Re: Comparison
« Reply #3 on: May 30, 2013, 06:53:03 pm »
Cheers for providing the evidence!  Anyone who is considering moving to a new project *SHOULD* put in the time to run 2 servers and evaluate on their own (as many have with MariaDB vs Percona vs MySQL).  nZEDb has obvious benefits over newznab - and who doesn't want to say PISS OFF to regex?  (Well, except us Asterisk/VoIP admins!  Ugh!)

Offline jonnyboy

Re: Comparison
« Reply #4 on: May 31, 2013, 02:42:16 am »
Agreed!!

Offline zombiehoffa

  • Enforcer
  • *****
  • Posts: 16
  • Helpful: +5/-0
    • View Profile
Re: Comparison
« Reply #5 on: May 31, 2013, 06:07:56 am »
slypknot: has anybody posted a comparison of mariadb vs mysql vs percona for nzedb/nn+ usage case? I'm kinda curious, I am running percona now, but have heard good things about mariadb.

Offline slypknot

Re: Comparison
« Reply #6 on: June 09, 2013, 06:53:38 pm »
Quote from: zombiehoffa on May 31, 2013, 06:07:56 am
slypknot: has anybody posted a comparison of mariadb vs mysql vs percona for nzedb/nn+ usage case? I'm kinda curious, I am running percona now, but have heard good things about mariadb.

MariaDB and Percona have been benchmarked, and the results were comparable.  Each has its own perks, but performance is influenced primarily by the configuration applied - Percona's my.cnf should be tweaked differently than Maria's.  I have no graphs/charts/analytics to link to - hence I recommend you do what I did - benchmark them for yourself.  It's damn good experience if nothing else.

IMHO, and from personal experience running my own benchmarks with 20GB DBs (outside of any usenet related materials + a professional DB guiding me), I found that BOTH are fantastic.  MySQL alone should be killed - and either branch should be selected.  After all my testing, I settled on Percona and passed on Maria.

May I also add that no matter which branch you select, you must build a good home for it.  Invest in SSDs (RAID0 with a single ATA drive managing backups outside of the RAID), 8GB memory MINIMUM... as in baseline.  Want to give her some breathing room, push for 12-16... hell, RAM is cheap these days.  Processors... I cannot say enough about quad cores.  I run dual Xeons - yes, expensive as hell, but so were the tires and stereo for my damn car.  Invest and you'll be sitting back watching others struggle trying to tweak another few Mbs outta their systems... meanwhile you'll be flexing yours and enjoying a different perspective of the world.

<steps off the soap box>

Offline techlte

  • Newbie
  • *
  • Posts: 4
  • Helpful: +0/-0
    • View Profile
Re: Comparison
« Reply #7 on: June 21, 2013, 03:14:19 pm »
Thanks for taking the time and sharing your results.