How would you accomplish this?

View: New views
7 Messages — Rating Filter:   Alert me  

How would you accomplish this?

by George George-3 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Given an array of arrays for example
[["A", "B"], ["A", "B", "D", "C"], ["B", "D"], ["B", "C", "F", "E"],
["C", "F"], ["C", "E"], ["G"]]

what would be a nice way of "merging" the items that seem to already be
contained in another array. e.g.
["A","B"] and ["B", "D"] are subsets of ["A", "B", "D", "C"] I would
like to drop the subsets  and be left with ["A", "B", "D", "C"] only.

My approach was to use the set module in Ruby, and look for set
membership using a pairwise comparison.
I also thought maybe i should look for the largest set. and then check
whether the rest of the sets are subsets of this set. if they are, then
delete them.

However i thought a better solution, or strategy may already exist.
How would you accomplish this?
--
Posted via http://www.ruby-forum.com/.


Re: How would you accomplish this?

by Robert Klemme-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

2009/11/4 George George <george.githinji@...>:

> Given an array of arrays for example
> [["A", "B"], ["A", "B", "D", "C"], ["B", "D"], ["B", "C", "F", "E"],
> ["C", "F"], ["C", "E"], ["G"]]
>
> what would be a nice way of "merging" the items that seem to already be
> contained in another array. e.g.
> ["A","B"] and ["B", "D"] are subsets of ["A", "B", "D", "C"] I would
> like to drop the subsets  and be left with ["A", "B", "D", "C"] only.
>
> My approach was to use the set module in Ruby, and look for set
> membership using a pairwise comparison.
> I also thought maybe i should look for the largest set. and then check
> whether the rest of the sets are subsets of this set. if they are, then
> delete them.
>
> However i thought a better solution, or strategy may already exist.
> How would you accomplish this?

The approach using set size seems reasonable to reduce the number of
set comparisons.  You could do

require 'set'

arrs = [["A", "B"], ["A", "B", "D", "C"], ["B", "D"], ["B", "C", "F", "E"],
["C", "F"], ["C", "E"], ["G"]]

result = []

arrs.sort_by {|a| -a.size}.each do |a|
  s = a.to_set
  result << s unless result.any? {|rs| rs.superset? s}
end

p result

-> [#<Set: {"B", "C", "F", "E"}>, #<Set: {"A", "B", "D", "C"}>, #<Set: {"G"}>]

Kind regards

robert


--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/


Re: How would you accomplish this?

by Jesús Gabriel y Galán :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wed, Nov 4, 2009 at 11:49 AM, George George
<george.githinji@...> wrote:
> Given an array of arrays for example
> [["A", "B"], ["A", "B", "D", "C"], ["B", "D"], ["B", "C", "F", "E"],
> ["C", "F"], ["C", "E"], ["G"]]
>
> what would be a nice way of "merging" the items that seem to already be
> contained in another array. e.g.
> ["A","B"] and ["B", "D"] are subsets of ["A", "B", "D", "C"] I would
> like to drop the subsets  and be left with ["A", "B", "D", "C"] only.

This is one way:

irb(main):025:0> a = [["A", "B"], ["A", "B", "D", "C"], ["B", "D"],
["B", "C", "F", "E"],["C", "F"], ["C", "E"], ["G"]]
=> [["A", "B"], ["A", "B", "D", "C"], ["B", "D"], ["B", "C", "F",
"E"], ["C", "F"], ["C", "E"], ["G"]]
irb(main):026:0> a.delete_if {|x| a.any? {|y| (x != y) && ((x&y).size
== x.size)}}
=> [["A", "B", "D", "C"], ["B", "C", "F", "E"], ["G"]]

Maybe there would be a way to optimize this sorting by size and
checking smaller ones against bigger ones only.
Left as an excercise for the reader :-)

Jesus.


Re: How would you accomplish this?

by Robert Klemme-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

2009/11/4 Jesús Gabriel y Galán <jgabrielygalan@...>:

> On Wed, Nov 4, 2009 at 11:49 AM, George George
> <george.githinji@...> wrote:
>> Given an array of arrays for example
>> [["A", "B"], ["A", "B", "D", "C"], ["B", "D"], ["B", "C", "F", "E"],
>> ["C", "F"], ["C", "E"], ["G"]]
>>
>> what would be a nice way of "merging" the items that seem to already be
>> contained in another array. e.g.
>> ["A","B"] and ["B", "D"] are subsets of ["A", "B", "D", "C"] I would
>> like to drop the subsets  and be left with ["A", "B", "D", "C"] only.
>
> This is one way:
>
> irb(main):025:0> a = [["A", "B"], ["A", "B", "D", "C"], ["B", "D"],
> ["B", "C", "F", "E"],["C", "F"], ["C", "E"], ["G"]]
> => [["A", "B"], ["A", "B", "D", "C"], ["B", "D"], ["B", "C", "F",
> "E"], ["C", "F"], ["C", "E"], ["G"]]
> irb(main):026:0> a.delete_if {|x| a.any? {|y| (x != y) && ((x&y).size
> == x.size)}}
> => [["A", "B", "D", "C"], ["B", "C", "F", "E"], ["G"]]
>
> Maybe there would be a way to optimize this sorting by size and
> checking smaller ones against bigger ones only.
> Left as an excercise for the reader :-)

Done. :-)

robert


--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/


Re: How would you accomplish this?

by Greg Barozzi :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

George George wrote:
> Given an array of arrays for example
> [["A", "B"], ["A", "B", "D", "C"], ["B", "D"], ["B", "C", "F", "E"],
> ["C", "F"], ["C", "E"], ["G"]]
>
> what would be a nice way of "merging" the items that seem to already be
> contained in another array.

Try this:

arr = [["A", "B"], ["A", "B", "D", "C"], ["B", "D"], ["B", "C", "F",
"E"],
["C", "F"], ["C", "E"], ["G"]]


arr = arr.flatten.sort

str = arr.to_s.squeeze

arr = str.split(//)


This takes the array of arrays, flattens it into a single array
and sorts it. Then turns it into a string so we can use the
squeeze method that eliminates duplicate characters. Then
it changes it back into an array. =)
--
Posted via http://www.ruby-forum.com/.


Re: How would you accomplish this?

by Marnen Laibow-Koser :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Greg Barozzi wrote:

> George George wrote:
>> Given an array of arrays for example
>> [["A", "B"], ["A", "B", "D", "C"], ["B", "D"], ["B", "C", "F", "E"],
>> ["C", "F"], ["C", "E"], ["G"]]
>>
>> what would be a nice way of "merging" the items that seem to already be
>> contained in another array.
>
> Try this:
>
> arr = [["A", "B"], ["A", "B", "D", "C"], ["B", "D"], ["B", "C", "F",
> "E"],
> ["C", "F"], ["C", "E"], ["G"]]
>
>
> arr = arr.flatten.sort
>
> str = arr.to_s.squeeze
>
> arr = str.split(//)
>
>
> This takes the array of arrays, flattens it into a single array
> and sorts it. Then turns it into a string so we can use the
> squeeze method that eliminates duplicate characters. Then
> it changes it back into an array. =)

Well, this will just yield the discrete strings -- not what the OP
asked.  But what you're talking about can be done more efficiently with
arr.flatten.uniq.sort .  No need for the intermediate string
representation.

Best,
--
Marnen Laibow-Koser
http://www.marnen.org
marnen@...
--
Posted via http://www.ruby-forum.com/.


Re: How would you accomplish this?

by Martin DeMello :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wed, Nov 4, 2009 at 4:19 PM, George George <george.githinji@...> wrote:
> Given an array of arrays for example
> [["A", "B"], ["A", "B", "D", "C"], ["B", "D"], ["B", "C", "F", "E"],
> ["C", "F"], ["C", "E"], ["G"]]
>
> what would be a nice way of "merging" the items that seem to already be
> contained in another array. e.g.
> ["A","B"] and ["B", "D"] are subsets of ["A", "B", "D", "C"] I would
> like to drop the subsets  and be left with ["A", "B", "D", "C"] only.

Here are two ways:

require 'set'

# --------------------------------------------------------------
# method 1:

sets = [["A", "B"], ["A", "B", "D", "C"], ["B", "D"], ["B", "C", "F",
"E"], ["C", "F"], ["C", "E"], ["G"]]

sets = sets.map {|x| x.to_set}.sort_by {|x| -x.size}

(0...(sets.length)).each do |i|
  next unless sets[i]
  ((i+1)...(sets.length)).each do |j|
    sets[j] = nil if sets[j] && sets[j].subset?(sets[i])
  end
end

sets = sets.compact.map {|x| x.to_a}
p sets

# --------------------------------------------------------------
# method 2:

sets = [["A", "B"], ["A", "B", "D", "C"], ["B", "D"], ["B", "C", "F",
"E"], ["C", "F"], ["C", "E"], ["G"]]

sets = sets.map {|x| x.to_set}.sort_by {|x| -x.size}

i = 0
while i < sets.length
  ((i+1)...(sets.length)).each do |j|
    sets[j] = nil if sets[j].subset?(sets[i])
  end
  sets.compact!
  i += 1
end

sets = sets.map {|x| x.to_a}
p sets

martin