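read test data from file or database

Hi everybody,

before I try to reinvent the wheel, I'm asking you guys because I'm sure you've been there. I'm looking for an efficient way to get test data (like "user_id") from a file or a database, so that I can use it in my Grinder EJB calls without slowing down the agent and impacting the throughput. I guess a file would be the way to go. It must be ensured that each id from the file is used only once by each thread. So I'm thinking about reading the file into memory (hash? linked list?), partitioning it according to the number of rows and the number of threads, and then serving the data to each worker thread. But what if I use several processes or agents? I could create data files for each process, but that's not very convenient. So if anyone has a working solution, I would greatly appreciate it.

That would also be a good candidate for the script gallery.

Thanks and regards,
Thomas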
------------------------------------------------------------------------- This SF.Net email is sponsored by the Moblin Your Move Developer's challenge Build the coolest Linux based applications with Moblin SDK & win great prizes Grand prize is a trip for two to an Open Source event anywhere in the world http://moblin-contest.org/redirect.php?banner_id=100&url=/ _______________________________________________ grinder-use mailing list grinder-use@... https://lists.sourceforge.net/lists/listinfo/grinder-use |
Re: read test data from file or database

Thomas,

File-based is the way to go for cross-process data sharing. You could lock the data row with a combination of the processNumber and the threadNumber, to make sure that each row is used uniquely for each test.

Another random thought, since this feature comes up frequently:

Would the console be able to distribute test data? It already has a communication layer between the agents. Maybe a new feature could allow the console to manage the data (from files or database) and distribute it to the agents (in subsets by agent/process, on request, etc.). Distributing subsets up front, at the start of the tests, would be best, I presume; distributing "on request" might have a significant impact on the sampling. Just a random thought.

Best regards,
Marc.
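One way to read the processNumber/threadNumber suggestion above is as static partitioning: each thread derives a unique slot from its two numbers and takes every N-th row. The following is only a minimal sketch in plain Python, assuming the number of processes and the number of threads per process are known up front. `rows_for_thread` and all of its parameters are hypothetical names for illustration, not Grinder API.

```python
def rows_for_thread(rows, process_number, thread_number,
                    process_count, threads_per_process):
    """Return the rows reserved for one thread (hypothetical helper).

    Every (process, thread) pair maps to a distinct slot, and each slot
    takes every total-th row, so no row is handed to two threads.
    """
    slot = process_number * threads_per_process + thread_number
    total = process_count * threads_per_process
    return rows[slot::total]


# Example: 10 rows shared by 2 processes with 2 threads each.
rows = ["user%d" % i for i in range(10)]
for process in range(2):
    for thread in range(2):
        print(process, thread, rows_for_thread(rows, process, thread, 2, 2))
```

If process numbers repeat across agents, an agent index would have to be folded into the slot calculation in the same way.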
Re: read test data from file or database

Marc Van Giel wrote:
> Would the console be able to distribute test data? It already has a
> communication layer between the agents. Maybe a new feature could
> allow the console to manage the data (from files or database) and
> distribute it to the agents (in subsets by agent/process, on request,
> etc.). Distribute subsets initially on start of the tests would be
> best I presume, "on request" might have significant impact on the
> sampling. Just a random thought.

This should work already. Just put the files in the distribution directory; they'll be shipped out to the agent caches. There's no support for sending specific files to individual agents, but I think that would overcomplicate things anyway.

There's no specific API to get the location of the cache directory, but the worker process can read the files by knowing that the cache directory is the first entry in the system path. E.g. (example modified from Cal's FAQ entry):

    import sys

    users = []
    # sys.path[0] is the agent's cache directory, where users.txt was shipped
    infile = open("%s/users.txt" % sys.path[0], "r")
    for line in infile.readlines():
        # strip the trailing newline before splitting the comma-separated fields
        users.append(line.strip().split(","))
    infile.close()

> On Thu, Dec 4, 2008 at 3:31 PM, Thomas Falkenberg
> <thomas.falkenberg@...> wrote:
>
>> Hi everybody,
>> before I try to reinvent the wheel I'm asking you guys because I'm
>> sure you've been there. I'm looking for an efficient solution to
>> get test data (like "user_id") from a file or a database in order
>> to use them in my grinder EJB-calls and without slowing down the
>> agent, impacting the throughput. I guess a file would be the way
>> to go. It should be made sure that each id from the file is only
>> used once by each thread. So I'm thinking about reading the file
>> in the memory(Hash? Linked List?), partitioning it depending on
>> the number of rows and the number of threads and then serving the
>> data to each worker thread. But what if I use several processes or
>> agents? I could create data files for each process, but that's not
>> very comfortable. So if anyone has a working solution, I would
>> greatly appreciate it.
>>
>> That would also be a good candidate for the script gallery.
>>
>> Thanks and regards,
>> Thomas
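The "read the file into memory and serve each id only once" idea from the original question can be sketched, for the threads of a single worker process, as a small lock-protected dispenser. This is only an illustration under that single-process assumption. `OneShotDispenser` and `take` are hypothetical names, and nothing here coordinates between processes or agents.

```python
import threading


class OneShotDispenser(object):
    """Hands out each item at most once across the threads of one process."""

    def __init__(self, items):
        self._items = iter(list(items))   # private copy, consumed front to back
        self._lock = threading.Lock()

    def take(self):
        """Return the next unused item, or None once the data is exhausted."""
        with self._lock:
            return next(self._items, None)


# Example: loaded once per process, then shared by all worker threads.
dispenser = OneShotDispenser(["alice", "bob", "carol"])
first = dispenser.take()
second = dispenser.take()
```

A thread that receives None has run out of data and has to decide for itself whether to stop or to reuse ids.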
Re: read test data from file or database

That's a cool idea, Marc and Phil, about distributing and using data files. I had a similar-but-different need, so I'll throw my solution onto the pile for consideration.

In my case, I'm testing a Web Server (REST and SOAP), where the "objects" of interest are not "pages", but rather piles-of-bits, named by a 44-hex-char "objid". I need to test 16 primitive operations on these objects: CREATE, READ, UPDATE, DELETE, plus some versioning, metadata, ACL and query operations.

When CREATING the objects, I throw them into a pool, storing with each object its objid, size (for later partial-READ and UPDATE ops), "type" (original object or read-only "version"/snapshot), and owner (for security/ACL purposes). When a Grinder thread wants to run a test, it: randomly picks a protocol (REST or SOAP); randomly picks an operation type (from the 16 primitives); randomly picks an object from the pool (unless it's CREATING a new object to add to the pool); randomly picks other operational info (e.g., content bytes for CREATE, byte-range for READ or UPDATE, etc.); randomly picks a server (this being a distributed/clustered/replicated system); fires off the WS request; receives a response; records the info in a log (additional to the regular Grinder logs); potentially also updates the pool data (e.g., modifying the size upon an UPDATE op); then repeats the cycle. [The random choices are not merely hard-coded uniform distributions, but rather proportionality-based per run, as specified by the user (e.g., 2 CREATE : 20 READ : 5 UPDATE : 1 DELETE : 0 others), or even tweakable mean/stddev gaussian in the case of CREATION sizes, etc.]

I originally implemented the obj pool as a Jython "list" data structure (really the only practical solution, given that I required random-choice and list-modification capabilities). I implemented this in 2 ways, PERWORKER and PERTHREAD. In PERWORKER mode, all threads in the Grinder worker share the same pool, so that multiple ops can be outstanding simultaneously on a given object. In PERTHREAD mode, the threads remove the objects from the pool when they're in use, so there are no conflicts. All this must be appropriately locked, of course.

This list solution was fine for me, in the sense that I had no requirement for sharing amongst processes (either Workers on the same Grinder node, or across Grinder nodes), so I simply created a new list for each Worker.

But the list solution wasn't scalable enough. When the pool size got close to 1M objects, the JVM/GC couldn't handle it, and eventually blew up with a heap-size exception (no matter how much I tried tweaking Java params).

So I switched from a list to a "BDB JE list", i.e., BDB JE indexed by ordinal (for random choices), and using the objid as a secondary index. When an object is DELETED from the pool, the last object in the BDB-list is moved into the vacated slot. (The secondary index is required for finding objects after they move this way within the BDB-list.)

That solved the scalability problem, but left the potential for perf degradation on the Worker (as noted by Thomas). To fix that, what I do is "stash" (i.e., pre-fetch) objects from the BDB-based pool into Worker memory (via a dedicated stashing thread). Perf problem solved.

BDB has the limitation that it can't be shared read-write amongst multiple processes (such as multiple Grinder Workers on the same Grinder node). That's OK with me, because as mentioned above I have no need to share the same object pool amongst multiple processes. (More generally for my setup, I have no need to use multiple Workers per load injector node; I always use 1 Worker, with as many threads as I need.)

But if I did need to implement a shared pool, the way I'd go is to use a "true" DB, something like MySQL or Postgres (er, PostgreSQL). Porting the BDB solution to SQL would be quite easy, all Workers on all Grinder nodes could share it, and stashing would ameliorate perf concerns. And really, a DB is the only available solution when you get to the stability realm I'm in.

- Walt
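The ordinal-indexed pool with "move the last object into the vacated slot" deletion can be illustrated in plain Python. This is not Walt's BDB JE code: it is an in-memory sketch that uses a list as the ordinal index and a dict as the secondary objid index. `ObjectPool` and its method names are hypothetical, and the locking a multi-threaded worker would need is left out.

```python
import random


class ObjectPool(object):
    """In-memory stand-in for an ordinal-indexed pool with an objid index."""

    def __init__(self):
        self._by_ordinal = []   # primary index: ordinal -> (objid, info)
        self._ordinal_of = {}   # secondary index: objid -> ordinal

    def __len__(self):
        return len(self._by_ordinal)

    def add(self, objid, info):
        """Append a new object; objid is assumed not to be in the pool yet."""
        self._ordinal_of[objid] = len(self._by_ordinal)
        self._by_ordinal.append((objid, info))

    def pick(self, rng=random):
        """Return a uniformly random (objid, info) pair."""
        return self._by_ordinal[rng.randrange(len(self._by_ordinal))]

    def delete(self, objid):
        """Remove objid by moving the last entry into the vacated slot."""
        slot = self._ordinal_of.pop(objid)
        last = self._by_ordinal.pop()
        if last[0] != objid:                 # deleted entry was not the last
            self._by_ordinal[slot] = last
            self._ordinal_of[last[0]] = slot  # keep the secondary index in sync


pool = ObjectPool()
for name in ("a", "b", "c"):
    pool.add(name, {"size": 0})
pool.delete("a")   # "c" moves into ordinal 0; ordinals stay dense
```

Keeping the ordinals dense is what makes a uniform random pick a single index lookup, and the secondary index is what lets a delete by objid find the slot without scanning.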
Re: read test data from file or database

/* Typo: In the last sentence, I meant "scalability realm". */
Re: read test data from file or database

Thanks Walter,

This sounds like a fine approach. I have considered using BDB for just this sort of thing in the past, but because of the need for sharing data between processes I ended up using the other Oracle database. :-)

- Phil
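When several worker processes share one database, as discussed in this thread, each process needs a way to claim a row that no other process can claim at the same time. The sketch below shows one possible pattern, an optimistic UPDATE that only succeeds while the row is still unclaimed. It uses Python's built-in sqlite3 purely as a stand-in for a server database such as MySQL or PostgreSQL. The table `test_data`, its columns, and `claim_row` are hypothetical names.

```python
import sqlite3


def claim_row(conn, worker_id):
    """Reserve one unclaimed user_id for worker_id; None when none are left."""
    while True:
        row = conn.execute(
            "SELECT id, user_id FROM test_data "
            "WHERE claimed_by IS NULL ORDER BY id LIMIT 1").fetchone()
        if row is None:
            return None
        with conn:  # commits the UPDATE, or rolls back on error
            cur = conn.execute(
                "UPDATE test_data SET claimed_by = ? "
                "WHERE id = ? AND claimed_by IS NULL",
                (worker_id, row[0]))
        if cur.rowcount == 1:   # we won the race for this row
            return row[1]
        # another worker claimed the row first: loop and try the next one


# Example with an in-memory database and three rows of test data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_data ("
             "id INTEGER PRIMARY KEY, user_id TEXT, claimed_by TEXT)")
conn.executemany("INSERT INTO test_data (user_id) VALUES (?)",
                 [("u1",), ("u2",), ("u3",)])
conn.commit()
claimed = [claim_row(conn, "worker-0"), claim_row(conn, "worker-1")]
```

Claiming rows one at a time costs a round trip per test, so in practice a worker would probably claim a batch up front, much like the pre-fetching described earlier in the thread.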