我有一套套装。我想从每个原始集合中创建最多包含一个元素的所有集合。例如,如果我的原始集合集合是((x,y),(A),(1,2)),那么解决方案是:
(x)
(y)
(A)
(1)
(2)
(x,A)
(x,1)
(x,2)
(y,A)
(y,A)
(y,1)
(y,2)
(A,1)
(A,2)
(x,A,1)
(x,A,2)
(y,A,1)
(y,A,2)我使用我编写的以下代码来递归计算这个值:
# gets an array of arrays (aoa)
# returns an array of arrays with all subsets where zero or one element is
# taken from each array, e.g. in = [[a,b],[5],[X,Y,Z]], out =
# [[],[a],[b],[5],[X],[Y],[Z],[a,5],[b,5],[a,X],[a,Y],...,[b,5,Y],[b,5,Z]]
# note the order of elelemnts in each arry is immaterial (an array is
# considered an unordered set)
sub sets_aoa_to_subsets_aoa {
my $aoa = shift // confess;
if ( scalar( @{$aoa} ) == 0 ) {
return [ [] ];
}
my $a = shift @{$aoa};
my $subsets_aoa = sets_aoa_to_subsets_aoa($aoa);
my @new_subsets = ();
foreach my $subset_a ( @{$subsets_aoa} ) {
# leave subset as-is
push @new_subsets, $subset_a;
# add one element from $a
foreach my $e ( @{$a} ) {
push @new_subsets, [ $e, @{$subset_a} ];
}
}
return \@new_subsets;
}然而,我想对子集的大小添加一个限制。例如,如果我设置了max_size=2,那么最后四个解决方案将被忽略。我不能简单地生成所有的解决方案,然后过滤那些太大的解决方案,因为有时我有100多个集合,每个集合都有2-3个元素,2^100不是一个很好的数字,特别是当我只想要大小为5或更小的子集时。
发布于 2010-11-08 01:07:19
正如我所怀疑的那样,正则表达式可以解决这个问题。
具体解决方案
下面是这个问题的具体解决方案。有80个答案。
my %seen;
"xy=a=12" =~ m{
[^=]* (x|y)* [^=]*
=
[^=]* (a)* [^=]*
=
[^=]* (1|2)* [^=]*
(?{
my $size = grep { length } $1, $2, $3;
print "<$1> <$2> <$3>\n"
if $size >= 1 &&
$size <= 2 &&
! $seen{$1,$2,$3}++;
})
(*FAIL)
}x;运行该管道到cat -n,您将看到80个答案。
当然,你需要一些通用的和可扩展的东西,这样你就可以将它应用于你的100个集合的情况。创建一个通用的解决方案总是比创建一个特定的解决方案需要更长的时间,所以我将致力于这个泛化,并在它看起来不错的时候尽快回复您。
一般解决方案
以下是一般的解决方案;它不是我最漂亮的作品,但它确实有效:
#!/usr/bin/perl
use 5.010;
use strict;
use warnings;
our($MIN_PICK, $MAX_PICK) = (1, 2);
our @List_of_Sets = (
[ qw[ x y ] ],
[ qw[ a ] ],
[ qw[ 1 2 ] ],
);
sub dequeue($$) {
my($leader, $body) = @_;
$body =~ s/^\s*\Q$leader\E ?//gm;
return $body;
}
################################
my $gunk = " (?&gunk) ";
my $alter_rx = join("\n\t(?&post)\n" => map {
" $gunk ( "
. join(" | " => map { quotemeta } @$_)
. " ) * $gunk "
} @List_of_Sets);
##print "ALTER_RX <\n$alter_rx\n>\n";
my $string = join(" = ", map { join(" ", @$_) } @List_of_Sets);
##print "STRING: $string\n";
my $numbers_list = join(", " => map { '$' . $_ } 1 .. @List_of_Sets);
my $numbers_bracket = join(" " => map { '<$' . $_ . '>' } 1 .. @List_of_Sets);
my $print_statement = dequeue "|QQ|" => <<"PRINT_STATEMENT";
|QQ|
|QQ| (?{
|QQ| no warnings qw(uninitialized);
|QQ| my \$size = grep { length } $numbers_list;
|QQ| print "$numbers_bracket\\n"
|QQ| if \$size >= $MIN_PICK &&
|QQ| \$size <= $MAX_PICK &&
|QQ| ! \$seen{$numbers_list}++;
|QQ| })
|QQ|
PRINT_STATEMENT
## print "PRINT $print_statement\n";
my $search_rx = do {
use re "eval";
my %seen;
qr{
^
$alter_rx
$
$print_statement
(*FAIL)
(?(DEFINE)
(?<post> = )
(?<gunk> [^=] * )
)
}x;
};
## print qq(SEARCH:\n"$string" =~ $search_rx\n);
# run, run, run!!
$string =~ $search_rx;我有点担心你希望从中抽出的可能性的数量。也许你应该把我上面概述的这个过程放在管道的另一端,这样你就可以随心所欲地阅读它,然后挂断电话,可以说,当你满足了你的需求。
我意识到这是一个相当不同寻常的解决方案;我的代码经常是这样。:)
我只是在想,您不妨让正则表达式的完全置换性质来为您做这项工作。
也许其他人会拿出Some::Abstruse::Module来为您做这项工作。你只需要权衡一下你更喜欢哪一个。
编辑:改进的易读性,处理副本和额外的最小/最大标准。
发布于 2010-11-08 02:24:13
也是一个递归的解决方案,但是将子集构建传递到最远,所以一旦达到最大大小,就可以停止。
#!/opt/perl/bin/perl
use strict;
use warnings;
use 5.010;
sub subsets
{
my ($sets, $maxSize, $subset) = @_;
$subset //= [ ];
# If we already have $maxSize elements, we're done
return ($subset) if @$subset == $maxSize;
# If we have no sets left to pick from, we're done
return ($subset) if !@$sets;
# Consider the next set
my @remainingSets = @$sets;
my $nextSet = shift(@remainingSets);
# We can choose either 0 or 1 element from this set, continue with the rest
return (subsets(\@remainingSets, $maxSize, $subset),
map { subsets(\@remainingSets, $maxSize, [@$subset, $_]) }
@$nextSet);
}
my $sets = [ [qw(x y)], [qw(A)], [qw(1 2)] ];
my @subsets = subsets($sets, 2);
foreach my $subset (@subsets) {
say '(', join(', ', @$subset), ')';
}发布于 2010-11-08 00:50:13
您可以创建一个“状态变量”来跟踪对sets_aoa_to_subsets_aoa的调用次数,然后在您的终结点条件中进行检查:
{
my $count=0;
sub sets_aoa_to_subsets_aoa {
$count++;
my ($aoa,$number_of_calls) = @_ // confess;
if ( (scalar( @{$aoa} ) == 0) or ($count == $number_or_calls)) {
return [ [] ];
}
......
}
}https://stackoverflow.com/questions/4118613
复制相似问题