I have a stream of streams (I did not choose this format and cannot change it). For example:
Stream<String> doc1 = Stream.of("how", "are", "you", "doing", "doing", "doing");
Stream<String> doc2 = Stream.of("what", "what", "you", "upto");
Stream<String> doc3 = Stream.of("how", "are", "what", "how");
Stream<Stream<String>> docs = Stream.of(doc1, doc2, doc3);
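I am trying to convert this into a Map<String, Multiset<Integer>> (or the corresponding stream, since I want to process it further), where the String key is the word itself and the Multiset<Integer> holds the word's count in each document in which it appears (zero counts should be excluded). Multiset is a Google Guava class (not part of java.util).
For example: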
how   -> {1,2} // because it appears once in doc1, twice in doc3 and none in doc2 (so doc2's count should not be included)
are   -> {1,1} // once in doc1 and once in doc3
you   -> {1,1} // once in doc1 and once in doc2
doing -> {3}   // thrice in doc1, none in others
what  -> {2,1} // and so on
upto  -> {1}
What is a good way to do this in Java 8?
I tried using flatMap, but the inner Stream greatly limits my options.
Solution
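// Needs: java.util.List, java.util.Map, java.util.Map.Entry,
//        java.util.function.Function, java.util.stream.Collectors
Map<String, List<Long>> map = docs
    .flatMap(inner -> inner
        // count each word within a single document
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
        .entrySet()
        .stream())
    // regroup the per-document (word, count) entries by word
    .collect(Collectors.groupingBy(
        Entry::getKey,
        Collectors.mapping(Entry::getValue, Collectors.toList())));
System.out.println(map);
// {upto=[1], how=[1, 2], doing=[3], what=[2, 1], are=[1, 1], you=[1, 1]}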
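For readers who want to run the approach end to end, here is a minimal self-contained sketch of the same two-step grouping. The class name WordCounts and the helper countsPerDocument are hypothetical names chosen for illustration, and the sample documents are an assumption picked so that the counts match the expected output in the question. It uses Collectors.summingInt instead of Collectors.counting so the counts are Integer values, as the question asks for, rather than Long.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class WordCounts {

    // Hypothetical helper: maps each word to its per-document counts,
    // skipping documents in which the word does not occur.
    static Map<String, List<Integer>> countsPerDocument(Stream<Stream<String>> docs) {
        return docs
            // Collapse each document into (word, count) entries. A word absent
            // from a document produces no entry, so zero counts never appear.
            .flatMap(doc -> doc
                .collect(Collectors.groupingBy(Function.identity(),
                                               Collectors.summingInt(w -> 1)))
                .entrySet()
                .stream())
            // Regroup the entries by word, collecting one count per document.
            .collect(Collectors.groupingBy(
                Map.Entry::getKey,
                Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    public static void main(String[] args) {
        // Assumed sample data, chosen to match the question's expected counts.
        Stream<Stream<String>> docs = Stream.of(
            Stream.of("how", "are", "you", "doing", "doing", "doing"),
            Stream.of("what", "what", "you", "upto"),
            Stream.of("how", "are", "what", "how"));

        Map<String, List<Integer>> counts = countsPerDocument(docs);
        System.out.println(counts.get("how"));   // [1, 2]
        System.out.println(counts.get("doing")); // [3]
    }
}
```

Each inner stream is consumed exactly once, inside the flatMap lambda, which is why the per-document counting has to happen there. If Guava is on the classpath, replacing Collectors.toList() with Collectors.toCollection(HashMultiset::create) should produce a Multiset<Integer> per word instead of a List; that variant is not exercised in this sketch.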