Matrix Multiplication in a Single MapReduce Pass

2020-07-28 Interview

Interview recap:
2020.07.26, Douyin big data engineer, first-round interview
Duration: one hour
Format: video interview

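1. Self-introduction
2. Coding question: implement the matrix product AB in a single MapReduce pass. Each map input record has the form (matrix name (A or B), row index, column index, value). A is m*n and B is n*k.
The row and column counts are very large, so the input can only be processed one line (one record) at a time.
3. A common question about RDDs in Spark
4. The shuffle process in Hadoop, in detail
5. Which optimizations help avoid data skew in Hive
6. Which databases have you worked with before: Redis? HBase? MySQL? Followed by a question about database engines
7. Bucketing in Hive
8. Definitions of processes and threads in an operating system, and the differences between them
9. Briefly describe the principle of locality and its applications in computing
10. What methods of inter-process communication are there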

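Main takeaway: the coding question was quite different in style from Java development questions. I need to pay more attention to coding problems and common ways of thinking in the big data area.
Second: for computer science fundamentals, knowing the key points is enough; once I had stated them, the interviewer moved on to the next question. The technology stack of the target area, however, has to be known very well, and the coding question matters a lot.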

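Matrix multiplication:
At the time I really could not work out what to use as the key and what as the value in the map and reduce methods so that the computation would fit the distributed model. My preparation was not thorough enough: I had been practicing problems from 剑指 Offer and LeetCode, and this interview showed me that I need to read, write, and accumulate more of the thinking patterns behind distributed algorithms. After the interview I found the problem analysis, design, and code in the article "MapReduce实现矩阵乘法" (matrix multiplication with MapReduce), and then ran the code myself in IDEA.
A reminder to myself: read more, think more, and keep accumulating.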
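The key/value choice that the code below relies on can be tried out without a cluster. Each result cell is C[i][j] = sum over t of A[i][t] * B[t][j], so the key is the output cell "i,j" and the value is the tagged element that contributes to it. The following is only a minimal in-memory sketch of that idea: the class name OnePassMatMulSketch, its methods, and the 2*2 sample matrices are made up for illustration, and a TreeMap stands in for Hadoop's shuffle. It does not use the Hadoop API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone sketch: simulates map -> shuffle -> reduce in memory.
public class OnePassMatMulSketch {

    // Map step: copy one record "name,row,col,value" to every output cell that needs it.
    static void map(String record, int m, int k, Map<String, List<String>> shuffle) {
        String[] f = record.split(",");
        if (f[0].equals("a")) {
            // A[row][col] contributes to every cell in row `row` of C: (row,1)..(row,k)
            for (int col = 1; col <= k; col++) {
                emit(shuffle, f[1] + "," + col, "a," + f[2] + "," + f[3]);
            }
        } else if (f[0].equals("b")) {
            // B[row][col] contributes to every cell in column `col` of C: (1,col)..(m,col)
            for (int row = 1; row <= m; row++) {
                emit(shuffle, row + "," + f[2], "b," + f[1] + "," + f[3]);
            }
        }
    }

    // Stand-in for the shuffle: group emitted values by key.
    static void emit(Map<String, List<String>> shuffle, String key, String value) {
        shuffle.computeIfAbsent(key, unused -> new ArrayList<>()).add(value);
    }

    // Reduce step: for one cell (i,j), pair A[i][t] with B[t][j] on the shared index t.
    static int reduce(List<String> values) {
        Map<String, Integer> fromA = new HashMap<>();
        Map<String, Integer> fromB = new HashMap<>();
        for (String v : values) {
            String[] f = v.split(",");
            if (f[0].equals("a")) {
                fromA.put(f[1], Integer.parseInt(f[2]));
            } else {
                fromB.put(f[1], Integer.parseInt(f[2]));
            }
        }
        int sum = 0;
        for (Map.Entry<String, Integer> e : fromA.entrySet()) {
            Integer b = fromB.get(e.getKey());
            if (b != null) {
                sum += e.getValue() * b;
            }
        }
        return sum;
    }

    // Runs the whole pass: m is the row count of A, k is the column count of B.
    public static Map<String, Integer> multiply(List<String> records, int m, int k) {
        Map<String, List<String>> shuffle = new TreeMap<>();
        for (String r : records) {
            map(r, m, k, shuffle);
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : shuffle.entrySet()) {
            result.put(e.getKey(), reduce(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample data (made up): A = [[1,2],[3,4]], B = [[5,6],[7,8]]
        List<String> records = Arrays.asList(
                "a,1,1,1", "a,1,2,2", "a,2,1,3", "a,2,2,4",
                "b,1,1,5", "b,1,2,6", "b,2,1,7", "b,2,2,8");
        // Prints the cells 1,1 / 1,2 / 2,1 / 2,2 with values 19, 22, 43, 50
        multiply(records, 2, 2).forEach((cell, v) -> System.out.println(cell + "\t" + v));
    }
}
```

The cost of doing everything in one pass is replication: every element of A is emitted k times and every element of B is emitted m times, which is the price paid for avoiding a second job.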

AMMapper:
package ArrayMultiply;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

import java.io.IOException;


public class AMMapper extends Mapper<LongWritable, Text,Text, Text> {
    // Matrix dimensions are hard-coded for the sample input: A is 4*3, B is 3*3
    private int m = 4; // number of rows in matrix A
    private int p = 3; // number of columns in matrix B


    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
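        // Each input line is "matrixName,row,col,value", where matrixName is "a" or "b"
        String[] fields = value.toString().split(",");
        if(fields[0].equals("a")){
            // A[row][col] is needed by every cell in that row of the result: keys (row,1)..(row,p)
            for(int i = 1;i<=p;i++){
                context.write(new Text(fields[1] + "," + i),new Text("a," + fields[2] + "," +fields[3]));
            }
        }else if(fields[0].equals("b")){
            // B[row][col] is needed by every cell in that column of the result: keys (1,col)..(m,col)
            for(int i = 1;i<=m;i++){
                context.write(new Text(i + "," + fields[2]),new Text("b,"+ fields[1] + "," + fields[3]));
            }
        }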
    }
}


AMReducer:
package ArrayMultiply;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

import java.io.IOException;
import java.util.HashMap;
import java.util.Iterator;

public class AMReducer extends Reducer<Text, Text,Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
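        // For the result cell named by key: mapA maps column index of A to value,
        // mapB maps row index of B to value
        HashMap<String,String> mapA = new HashMap<>();
        HashMap<String,String> mapB = new HashMap<>();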

        for (Text value : values) {
            String[] val = value.toString().split(",");
            if("a".equals(val[0])){
                mapA.put(val[1],val[2]);
            }else if("b".equals(val[0])){
                mapB.put(val[1],val[2]);
            }
        }

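        // Sum the products of entries that share the same inner index;
        // an index missing from mapB is skipped, i.e. treated as zero
        int result = 0;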
        Iterator<String> mKeys = mapA.keySet().iterator();
        while(mKeys.hasNext()){
            String mkey = mKeys.next();
            if(mapB.get(mkey) == null)
                continue;
            result += Integer.parseInt(mapA.get(mkey))
                    * Integer.parseInt(mapB.get(mkey));
        }
        context.write(key,new IntWritable(result));
    }
}

Permalink: https://www.fluffysponge.fun/2020/07/28/%E4%B8%80%E9%81%8DMR%E5%AE%9E%E7%8E%B0%E7%9F%A9%E9%98%B5%E4%B9%98%E6%B3%95/

Copyright notice: Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0 CN. Please credit the source when reposting!

InstantCWeedStudent
